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ABSTRACT 


Integration of application programs into a single system 
has become increaSingly important as the workstation 
environment moves toward uniformity for easier learning and 
use. This thesis proposes an Integrated Software System 
(ISS) based on the Relational Database model as a Suitable 
basis for integrating five common applications found in a 
business office. Relations, or tables, are defined as the 
common data objects and it is shown how they are used to 
support each logical application. Operations based on 
relational algebra are defined which extend the functions of 
ISS beyond the aggregate of the five chosen applications. A 
Simple graphical user interface is designed for the kernel 
of the system and a design for a kernel prototype uSing a 


Unix environment is presented. 
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Lene OOUCT ION 


The introduction of computers into the office and home 
environments has led to the development of numerous software 
packages’ to achieve many different tasks. Common 
applications such as word processing, form generation, 
database management, electronic mail and spreadsheet 
modeling have become almost essential in the workstation 
environment. The unintegrated software packages generally 
available have different operating instructions, different 
command vocabularies and often different underlying 
conceptual models of the system and its data. Data 
Structures and files belonging to one application are often 
inaccessible to another. USers must learn the fundamentals 
and conceptual model of each application Separately, even 
though many of the baSic operations in the different 
applications are essentially identical. 

An integrated software system is a software package 
which includes a number of logical applications and which 
attempts to minimize the disparity by providing a_e single 
conceptual model Of the underlying system for all 
applications. The single unified view of the system overall 
allows a user to learn the system more easily and perceive 


it as an actual single integrated system. 





eee Lee ISS 

An integrated software system combines the functions of 
its applications into a single package by using a Single 
conceptual data object and command vocabulary. The 
vocabulary includes a basic set of commands, which apply to 
every logical application, and application specific commands 
for each application. The basic set of commands perform the 
operations which are common to all applications. Since only 
one conceptual data object is defined in the system, these 
commands operate uniformly on their operands regardless of 
the application, and represent the intersection of functions 
between the logical application areas. By providing only 
one set of common operators on one data Structure, a user 
can learn to use the system easily and quickly. A user may 
become familiar with new applications by learning only a 
small additional number of application specific commands and 
functions. 

This thesis describes the design of an Integrated 
Software System, the ISS, which uses the Relational Database 
model as the underlying conceptual model, with the single 
data object being the relation. The relation is more easily 
discussed and understood as being simply a table where the 
tuples are just rows and the attributes just column names. 
Both of these terminologies will be used in this thesis. 
The ISS design integrates five applications or application 


areas: word processing, form generation, database 





management, electronic mail and spreadsheet modeling. A 
table format is designed to Support each application area 
and the intersection of functions is described by defining 
the primitive operators which operate on all tables and on 
which applications and utilities may build to effect their 


particular transactions. 


Pee lrHeES ITS OBJECTIVES 

The first objective of this thesis is the design of the 
system of tables to support the ISS. Data tables and system 
tables were designed also to Support the concurrent research 
into multilevel security (Ref. 2]. 

Secondly, a set of primitive table operators and general 
system commands is defined which form the kernel of the ISS. 
These operators may be used by a Single user and by 
applications to manipulate data tables in the desired 
fashion. A set of extended binary operators is defined 
which is traditionally used in database applications, but 
which is useful for other applications as well. Thirdly, a 
Simple graphics oriented user interface for the ISS is 
defined. 

Finally, a prototype implementation strategy for the 
data table object and the primitive operators is described 
for the Unix environment. This crude design when 
implemented may then be used to prove the utility of the ISS 


to support user needs in an operation environment. 
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The work in this thesis is a refinement and extension of 
the work carried out by Nishimura in his thesis, "Analysis 
Of The Relational Database Model In Support Of An Integrated 
Applications Software System." (Ref. 1] The tasks to re- 
evaluate the logical data bases and basic operators, to 
design the basic user interface and, to design a simple 
prototype were chosen from Nishimura's suggestions for 


follow-on research. 


fee ltHeolS OVERVIEW 

The system described in this thesis is the conceptual 
model of the ISS, its data object and operations. The 
design of tables and basic operators are complete in that 
they can support all the application areas. As described in 
Section B above, the objectives of this thesis are _ to 
address this kernel to ensure its completeness and 
consistency with the overall conceptual model. This model 
is developed as a "Single uSer at a time" relational 
database model. 

The intention is to develop the conceptual model for the 
user without regard to physical realities which must be 
conSidered in implementation of the system. Despite this 
Seecempt at maintaining simplicity, a certain amount of 
knowledge regarding fundamental relational database theory 
is assumed, particularly regarding the material in chapter 


mieeama section C of chapter IV. 
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Provision is made for multilevel security enforcement by 
the inclusion of special tables and fields for that uSe and 
are included for the furtherance of concurrent research 
(Ref. 2] into multilevel Security. Essentially multilevel 
security requires document access control by Security 
classification, special group membership or job status (by 
compartment and caveats), or by explicitly mamed access 
ists. This access control must be enforceable at a 
document and paragraph level and when combined with 
generally accepted database controls on column values, 
amounts to the requirement of controlling access to table 
entries by column and e “row. In addition a table of 
legitimate users is required to be kept which includes’ the 
user's clearances and other covers. The maintenance of 
multilevel security in any system is a complex problem and 
is beyond the scope of this thesis. The framework for such 
Controls is included, but the issue is not developed 
further. Interested readers may consult the references for 
Further details of multilevel Security requirements. 

Chapter II surveys related work in the relational 
database (RDB) field, in particular the analysis which was 
done to show the feasibility of using the RDB model in an 
integrated environment. Other research efforts are 
discussed which bear little direct impact on the conceptual 
design of the ISS but which may be helpful in subsequent 


work on implementation and application development. The use 
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of the RDB model for applications other than those chosen 
for the ISS is briefly described. 

Chapter III describes the set of tables which make up 
the ISS: the system tables required to maintain the ISS, the 
data tables of each application, and how they both may be 
used to support their applications. 

Chapter IV defines the abStract interfaces of the 
primitive operators available to manipulate data tables, and 
how they may be used to realize the basic operations common 
to all of the applications. These operators form the Kernel 
on which applications may be constructed. A further set of 
combining operators is defined which is commonly found in 
database applications but which are shown to be of uSe in 
the other application areas and in some cases in combining 
tables from different applications. 

Chapter V presents a simple, graphics oriented user 
interface for the kernel of the ISS. The general system 
operation and the form of the basic operators are defined in 
this design. The deSign resembles a form filling process 
Similar but simpler than IBM's Query By Example database 
Query language. Commensurate with the goal of uniformity 
for the ISS, this design is intended to provide a basis for 
the design of the commands in each application. 

Chapter VI describes a simple prototype implementation 
Strategy for the kernel primitives and general system, using 


the Unix programming environment. jige describes the 


Ie 





methodology which can be used to write very simplified I/0 
modules to map the designed user interface into the abstract 
interfaces of the lower level modules. 

Finally, chapter VII presents our remarks, conclusions 


and suggestions for follow-on research. 
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II. RELATED WORK 


In this chapter we discuss previous work in application 
systems which use a Relational Database (RDB) as_ the 
underlying model. Section A describes in detail the initial 
design of the Integrated Application Software System which 
motivated this thesis. Section B discusses previous work on 
editing and relation browsing, general material describing 
desirable features of editors, and how relation browsers and 
editing may be implemented in the Relational Model. The 
final section describes briefly other applications for which 


the Relational Model can be used. 


A. INITIAL DESIGN 

The premise of this thesis proposal is an extension of 
the research efforts presented by Nishimura, [Ref. 1]. In 
the initial design phase, Nishimura evaluated the utility of 
the Relational Database model to conceptually integrate the 
Five application areas of interest: EGXt processing, 
relational database management, form generation, electronic 
mail and spreadsheet modeling. 

Motivation for the thesis research stemmed from_ the 
realization that each of these five applications perform 
Similar functions and are conventionally utilized as a non- 


integrated collection of application software. In this 


Ihe: 





non-ISS, each application. implements operations on 
dissimilar file types and the data among the files are not 
directly sharable. In addition, in order for the user to 
utilize them, he or she is required to learn a large number 
of commands for each application, many of which are 
synonymous. If the commonality of the application areas 
could be collapsed into one application like the kernel of 
an ISS, this would significantly reduce the redundancy and 
minimize the command vocabulary needed to utilize the 
applications. 

After choosing the relational database model as_ the 
kernel for the ISS, it was decided that the most natural 
data object to use in the system was the table. In the ISS, 
each of the five applications is a logical database 
consisting of a set of tables. The set of tables are of 
three classes: Application Directory, Data Table Schema, and 
Data Table. Each column in the table represents one 
attribute of the file and each row represents a unique 
occurrence. The tables include columns which represent’ key 
values to uniquely identify each row. Any datum ina table 
can be accessed by specifying the name of the table, the 
value of the key, and the name of the attribute containing 
the datum. 

The Application Directory Table contains descriptive and 
definitional data about the data tables in an application or 


logical database. Each row in the Application Directory 
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Table describes one data table and has a standard schema. 
For each application, the schema of the Application 
Directory Table can be augmented to include additional data 
table attributes. 

A data table represents the logical file of an 
application. The data tables are typed in accordance with 
their primary use, 1.e., text, form text, database, spread 
sheet, and mail. The data tables are categorized by types 
in order to logically organize those which are used 
primarily by the same application. Since one of the key 
objectives of an ISS is to be able to share data, strong 
data table typing is not a feature in the design. 

The Data Table Schema Table contains a row for each 
column in a Data Table. In some Sense this schema represents 
variable declarations in a conventional programming 
language. 

Each row in the Application Directory Table is linked to 
a set of rows in the Data Table Schema Table and a Data 
Table. The same relationship among the tables exist for 
each application. 

Since the Application Directory Table of each 
application is identical, no further discussion of that 
table type is necessary, however the data tables are 
different. All data tables in one application, except the 
database application, have the same structure. All tables 


have an ID column. The Text Data Tables have just one other 
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Solum, TEXT: LINE which is the text. Form data tables 
Similarly have two columns except the second column, FORM 
LINE is meant to be used differently by the form processing 
application. A Mail Data Table has eight columns which in 
each row describe the origin and destination of each 
message, and hold the message itself or pointer to a large 
message body. Spreadsheet Data Tables have six columns So 
that every row describes an X,Y coordinate pair for the 
conventional spreadsheet view. The Database Tables are the 
only ones without a predetermined Structure, they have the 
ID column and any number of other columns’ required by a 
user. 

In order to realize the main objectives in an ISS, one 
must present the user with a Single conceptual view of the 
System regardless of the context of its use. The 
practicality of this objective was demonstrated through the 
use of tables as described above. In addition, the user 
must be provided with a common set of table operators as 
well as a set of application dependent operators. The 
intersection of operations proposed to manipulate data in 
the files is comprised of the locate, insert, modify, 
delete, copy, and move operations. Nishimura gave 
conceptual descriptions of how the six basic ISS primitives 
Can perform the operations in the functional intersection. 
The primitive operators suggested (insert, modify, delete, 


Project, select, and union) are based on relational algebra. 
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In the initial design phase it was shown how these operators 
could be utilized on each application to yield the same 
result as all conceivable queries in a non-Integrated 
Software System environment. 

It is the intent of this thesis to refine this 
conceptual design and view. The refinement will ultimately 
prove the feasibility of these concepts by implementing the 


fundamental ideas presented herein. 


B. TEXT EDITING AND RELATION BROWSING 

The need to edit and browse relations is important to 
each of the applications in an Integrated Software System. 
Memeamg of text particularly is the subject of much 
research, both psychological and technical, to determine the 
qualities of a good editor. 

An ideal editor is a subjective term determined by the 
needs of the user, however the following general features 
may be identified as desirable: [Ref. 3] 

(l) A consistent conceptual model of the the system 
with a clear and concise user interface including on-line 
help and documentation. 

(2) Powerful facilities that take advantage of computing 
power, with an "infinite undo" capability. 

(3) An ability to edit a format which closely resembles 
the intended target format. (Whether it be intended for 


Demaecopy or not.) 
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(4) Fast and visible responses to all commands however 
Peavial or complex. 

(5) Access to Shared information and other contexts on 
the same display surface without leaving the editor, and the 
ability to access other applications or to be part of a 
larger integrated environment. 

(6) The ability to edit mixed targets such as _ text, 
forms, programs and data. 

The above points are worthy goals for editor. and 
browSing interfaces and have been considered in our system 
design. 

A Browser, aS itS name implies, is a software facility 
which allows the user to "browse" through information 
related to an application. TIMBER (Text; = lcon, and Map 
Browser for Extended Relations), as discussed in [{Ref. 4], 
is the design of a sophisticated user friendly, graphics 
oriented browser for a relational database. 

TIMBER was designed to meet the objectives of four 
application areas. TE€~ proyides user interface to a 
relational database system that can be: 

(1) a relation browser for fixed format relations; 

(2) a sophisticated browser for relations with icons; 

(3) an editor for text data stored in relations; 

(4) amap browser for geographical data. 

The designers of TIMBER- propose that each of the 


aforementioned applications virtually perform the same 
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function and thus can be served by an Integrated Software 
system. 

The basic concepts of TIMBER are: a window on a 
Sophisticated graphics terminal; a relation in a database; 
and a cursor. The screen of the terminal can be split into 
several rectangular windows to which relations are bound. 
The tuples from the relations appear in the windows and can 
then be manipulated by TIMBER commands. The cursor on the 
screen is controlled by a bit pad and a mouse and many 
TIMBER commands affect the window in which the curSor is 
currently positioned. 

The relations of TIMBER are of four types, each of which 
has a default Screen format and coordinate system. The 
types are: normal fixed format, text, icon, and map. There 
are three ways in which the uSer can alter the contents of 
the windows on the screen. One can move the cursor, use the 
ZOOM feature, or use the six relational commands ak becting 
the window. 

TIMBER is composed of six major modules: Application 
Program, Intelligent Buffer, High Graphics, Low Graphics, 
Smart CRT, and INGRES. INGRES, however, does not support 
TIMBER without several added extensions. First, ICON must 
be added as a new data type. Second, the notion of ordered 
relations must be incorporated in INGRES in order to support 
the storage and manipulation of documents. Finally, there 


1S a need to incorporate an efficient concurrency control 
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system which will enable multiple concurrent browsers which 
perform updates. 

In summary, the design of TIMBER, which is intended as a 
sophisticated two-dimensional graphical browser for text 
relations, fixed format relations, and relations containing 
icons or mapS, provides several impressive features which 
Support the use of the relational database model as_ the 
kernel of an ISS. 

Several research projects including those mentioned 
below (Section C.) have dealt with aspects of text editing 


and Storage within the relational model, but the most 


extensive work in this area has been published by 
Stonebraker and various co-authors. They have considered 
the requirements of document processing and _ proposed 


enhancements to RDB management systems to Support two text 
formats and a capable editor. [Ref. 5] Stonebraker's 
proposal is to provide facilities in a RDBMS which can 
Support calls from a text editor running as an application 
program 

The two schema for text format are: 

(1) a binary relation with attributes line number. and 
and a text line, and 

(2) a ternary relation with attributes sentence, word 
number and text word of some maximum length. 

Each scheme has an advantage for certain relational 


Operations and the authors suggest that both are appropriate 
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and that facilities are required to transfer from one _ to 
another. The first enhancement is the inclusion of variable 
length strings as table column types which may be stored as 
variable length fields or external to the database with 
Memmeers to their location. Since the order of these lines 
of text (and the order of the words in the case of (2) 
above) is important, Stonebraker describes the mechanism of 
ordered relations in which the database manager effects the 
ordering using unique tuple identifiers (TIDs) which are 
assigned by the system to each new tuple added to the 
relation. An ordered B-tree access structure is kept in 
which TIDs are at the leaves in the intended ordering on the 
tuples and each internal node holds the number of TIDS in 
each of its subtrees. The TIDs then act as keys into the 
relation containing the text, in which the text lines are 
unordered. Access to a specified line number is easily done 
by keeping count while traversing through internal nodes, 
Selecting the appropriate subtree to find the correct leaf. 
Insertion and deletion is also easy, simply updating the 
internal nodes of the access structure and inserting or 
deleting the TID at the leaf, using standard B-tree 
algorithms. A generalization of this structure is described 
for format (2) above where the leaves contain a pointer to a 
Second B-tree holding the ordered words within the ordered 


Sentences. 
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Stonebraker introduces the idea of "extended wild card" 
substitutes for any character string, which may be used in 
qualification and target lists of replace and substitute 
commands. Essentially each extended wild card of form *i 
(for i in some integer range), once matched to a substring 
in an expression retains the value of that substring and so 
may be used in the target list. 

To support other substring operations a simple substring 
operator is described which can select substrings between 
bounds set by poSition or content. A "break" operator 1s 
defined to perform the transformation of text in form (1) 
above to form (2), and a more generalized concatenation 
operator described for the reverse transformation. 

Stonebraker carried out some relative performance 
evaluations of an editor running as an application program 
on the Relational Database Management System INGRES- under 
UNIX, against the performance of the UNIX line editor ED. 
His results may be found in the reference, but they were 
Sufficient to indicate that indeed a Relational Database 
system could be useful in the processing of text in ordered 


relations. 


C. OTHER RDB MODEL APPLICATIONS 
With the maturing of database technology, there has been 
€@ growing awareness of its usefulness in applications other 


than data processing. Areas such as Office Information 
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Systems, Engineering Design, Programming Environments, 
Operating Systems and recently, Artificial Intelligence, 
have been considered as areas where the Relational Model 
will be useful. 

A recent approach to Automated Business Procedures has 
been the development of a "forms" oriented design of an 
integrated office, including word processing, mail and 
office communication tasks. [Refs. 6 and 7] 

Essentially forms are stored in a Relational Database, 
each table representing a form type with each tuple an 
instance of a raised form. Included in tuples can be fields 
containing HiGdiiamintormatklon, sana Of Course integrity 
constraints can be applied to signature and special data 
fields. Each form type has its own display format which may 
include the details currently shown on printed forms, or be 
aS Simple as just one large text field for data entry for a 
text file. The display format for general users would not 
include audit information and is actually a view of the 
database. Forms can be used to display forms, enter data, 
perform word processing, and execute queries with the 
underlying RDB Management System providing the necessary 
facilities. Queries can be expressed by general users using 
the display views much like QBE, and more detailed queries 
may be framed on the conceptual database for audit and 


Statistical purposes. 
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The increasing size and complexity of designs, in 
particular VLSI, and the problems of managing their data, 
has led to much research on the use of Relational Database 
Management Systems for these purposes. In doing so it is 
felt that extenSions are required to the RDB model to suit 
the applications. Stonebraker, Rubenstein and Guttman [Ref. 
8] discuss the inclusion of abstract data types (structure 
and operations) in INGRES and the use of extended indices 
for them. They propose that these extensions will help to 
improve the usefulness of database systems in Computer Aided 
Design applications. 

Lorie and Plouffe [Ref. 9] discuss extensions to System 
R to make it more appropriate for engineering and design 
applications. They describe a method of storing complex 
objects, each spread across a number of different relations 
in which each component of an object is related to a 
hierarchical parent component in a tree structure. They 
also describe a mechanism for the sharing of design objects 
within a design environment which allows for the relatively 
Slow transaction time in a deSign activity as compared to a 
Straight forward data procesSing transaction. These ideas 
have also been developed by IBM's Yorktown Heights Computer 
Science Research Division [Ref. 10} and include designs for 
the storage and retrieval of arbitrarily long fields for 


both simple and complex data objects. 
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Representation of programs in the Relational Model = and 
using that model to create, store, debug and manage program 
information is the primary goal of the OMEGA programming 
environment. [Ref. le) An editor/query processor is 
designed on top of a Relational DBM system which gives a 
unified view of program and data and provides very powerful 
facilities for debugging and monitoring execution of 
programs. In this project they have found it necessary to 
@ea@evdriable length strings and ordered relations to the RDB 
model, aS well as three extensions which support the 
programming environment. 

A new approach to Operating Systems has recently been 
Suggested [{Ref. 12] which uses an underlying RDB to present 
a unified conceptual view of the system to both user. and 
systems programmer. An operating system can be implemented 
as a large integrated package uSing a Relational DBM system 
aS a kernel. Interaction with such a system would be by 
access and manipulation of system :srelations using a command 
language based on a relational query language. The 
inclusion of a transaction mechanism to abbreviate commonly 
used commands would add a richness to the command language 
while retaining the conceptual view of the operations. The 
authors suggest that a graphical database query and update 
language like QBE would provide a flexible, easy, powerful 


and friendly user interface to such an operating system. 
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There has been recent interest and discussion [Ref. 13] 
on the possible uses of a relational DBM system as the model 
for Artificial Intelligence expert system Knowledge bases. 
The Similarity between stored data and "facts", the 
potential to include in a database an inference mechanism, 
and the possible use of some sort of trigger to emulate 
"rule firing" are questions which have been posed but not 


yet researched. 
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fit. Some CClLURES AND USE OF ISS TABLES 


The set of tables chosen for the design of this system 
is an extension and refinement of those described in Chapter 
II, including extra system tables for system use to. provide 
a mechanism capable of supporting multilevel security. This 
full set of tables are those designed for the conceptual 
view of the system overall using the relational database 
model and need not necessarily be implemented exactly as 
described. 

Within each of the five application areas there are four 
classes or types of ttables: Data Tables, a Directory, 
Paragraph Classification Tables (PCTs) and Access Control 
Tables (ACTs). The data tables for each application are the 
tables created or used by the user while the others can _ be 
considered system tables for mostly system use. Although 
the data tables in each application area differ, the system 
tables have similar structure and function in each of the 
five areas. In addition to the above tables which make up 
the logical databases of the five applications, there are 
three table types which are used by the ISS kernel: ot ee 
Users Login Table, (2) a Capabilities List Table for each 
user and, (3) a Schema Table which describes every unique 


attribute or column of every table in the system. Of the 


INS, 





above tables the Login, PCTs, ACTs and CLTs are used by the 
system to enforce security. 

The sections of this chapter describe the structure = and 
@emeral use of the above table types, first dealing with the 
meetem tables and then each of the five application area 
data tables: text, forms, data, electronic mail and 
spreadsheet. The Schema Table is described first and then 
its format is used to describe the Structure of every column 
Meal fixed table types. The ISS must be installed with 
the schema rows describing the columns of system tables and 
the columns in the four standard table types for text, 
ZoOrmMs , spreadsheet and mail. Schema Table rows’ for 
previously undefined columns in Database Data Tables will be 
added as users create the new database tables. 

Reference and retrieval of tables and tuples by the 
System is by name, with the advantage that this provides the 
most eaSily understood interpretation of pointers. 

In this thesis we will not discuss the recursive 
problems encountered when those tables provided for 
multilevel security are, themselves, treated simply as data 
tables requiring coverage by further security controls. The 
Special multilevel security tables (ACTs, PCTs and CLTs) 
have been included to provide a framework for future 
research and are not dealt with in later chapters on 


implementation strategies. 
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A. SCHEMA TABLE 
The Schema Table is a single table which contains a _ row 


for each different column (attribute) name throughout all 


tables in the ISS. The system uses’ the schema in 
conjunction Wlenweeae Couumm Listing im a directory to 
determine the structure of a table it is about to 
manipulate. The Schema Table columns are illustrated in 


figure 3.1 and are described by tuples from the table itself 
as shown in figure 3.2. 

The ID column is a six digit integer which is a field 
simply representing the conceptual ordering of the rows in 
the data base and their display order as a table. The 
underlying physical system (implementation) need not store 
the relation in this fashion. It is attached to all tables 
in the system and corresponds to the record number in 
systems such as DBASE II. It will not be described again 


for other tables. 


pid | NAME | TYPE {| WIDTH | SYNONYM /| TABLE 


Figure 3.1 - Schema Table Schema 


The NAME column is Simply the textual name of a column 
which may appear in many tables of one or more types or a 


Single table such as some database table. "ID" and "NAME" 
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are typical values which may be found in this column. TYPE 
and WIDTH simply describe the physical data type and maximum 
size with the convention of "0" meaning “of varying length". 
SYNONYM is a column which gives the names of columns 
throughout the system which have the same characteristics 
and may be considered to carry compatible data under certain 
circumstances. TABLE gives a particular table name or type 
of table in which the column being described may be found. 
A simple literal denotes a particular table, a literal 
preceded by "=~" indicates all of a particular table type or 
class. For example the "“ALL" in the TABLE column of 
figure 2.2 indicates the ID column is in all tables. 
Similarly -TEXT in a TABLE column would indicate the column 
being described by that row is in all Text Data Tables. 
Similar meanings are attributable for the other fixed 
Structures, -FORM, -MAIL and -SPREAD. 

The Schema Table is used by the system for retrieval of 
tables and by a user while creating a new Database Data 
Table. (The structure of tables in the other applications 
are predetermined and are described in sections G through 
K.) In the process of creating a new database table, its 
Structure will be defined as the uSer describes each new 
attribute of the new table by appending a new row to. the 


Schema Table. 
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IB; NAME deere: WIDTH SYNON YM TABLE 


Ce na ee ee ee ee 


Figure 3.2 - Self Describing Tuples in Schema Table 


EeeeelooUSERS TABLE 

The ISSUSERS Table is used by the system to establish 
identities and security clearances of the user. The table 
has one row for every logical user of the ISS and should be 
accessible only to the database administrator (DBA). The 
table schema is illustrated in figure 3.3 and the columns 
are described by the appropriate rows from the Schema Table 
memeragure 3.4. 

The USER NAME is simply the name of a logical user. and 
need not be unique since the key to the table is the USERID 
which must be a unique alphanumeric string for each logical 
Ser. PASSWORD iS an encoded password and each of the 
fields with application names is a logical indicator of a 


user's permission to access the directory and data tables in 


8) 8) 


| [ ID [INTEGER| 6 | [ALL 
re cre 
ee eg | cre 
ea 4-71 a a 
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that application. CLEARANCE is the users security clearance 
and CC is aerecord of the user's compartment and caveat 
status in a multilevel security environment. ROLES is a 
field which may be used to denote special functions a user 
may have such as database administrator, or reviewer in a 
multilevel security system. CAPABILITIES is the name of the 
capabilities list table associated with that user and is 
@escribed in section F of this chapter. 

When a user invokes the ISS, the system retrieves the 
appropriate row from the ISSUSERS table and conducts a 
Standard password procedure. Enhancements may enable the 
System to allow a limited number of password entry failures 
and then lock the device or initiate some alarm procedure. 
The users clearances, compartments, roles and access rights 
are retained as system variables throughout the ensuing 
session and can be used for access control during all 
transactions. 

Insertion of new users may be done by the DBA by 
appending new tuples to the ISSUSERS relation. Similarly 
Clearances, compartments and roles of the uSersS may _. be 
Changed by the DBA by modifying the appropriate values in 
the desired row. It may be desirable to include in the 
System a routine or view mechanism to allow users to change 
their own passwords, since a one way encoding step must 


first be applied to a password before storage. 
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| ID | USER _NAME | USERID| PASSWORD | TEXT [| DATABASE 


FORMS | 
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MAIL | 
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Figure 3.4 - Schema Table Rows For ISSUSERS 
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fee APPLICATION DIRECTORY TABLES 

A Directory Table exists for each of the five 
application areas of the ISS with each row in_ the 
application directory describing exactly one data _ table. 
These directories can be used by users and by the system to 
find particular data tables which exist for the selected 
application. The schema for the Directory Tables is 
illustrated in figure 3.5 and each of the columns is 
described in figure 3.6 by the appropriate rows from the 


Schema Table. 


| ID | TABLE NAME | COLUMNS| KEYS | O CLEAR | 0 CC 
| | | | | | 


VIRTUAL | CONDITION GLOBALS | OWNER DESCRI-| 
| | | (eos 0) 


Figure 3.5 = Directory Tables Schema 


In each row of a directory, TABLE NAME is the unique 
name of a data table which is being described by the row and 
is the pointer to that table in the underlying relational 
database management system. COLUMNS is a list of the 


columns in the data table, and with the Schema Table tuples 
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for those columns, completely describes the structure of the 
data table. KEYS is a field containing the names of those 
columns which comprise the key to a table. O CLEAR and OCC 
are fields giving the overall security level, compartments 
and caveats of a table. M CLEAR and M_CC give the minimal 
requirements to access some parts of the table, particularly 
Text Tables, where low security paragraphs in a. high 
security document may be made accessible to those with the 
lower security clearance. PCT POINTER and ACT POINTER are 
the names of the two tables which control column and_= row 
wise access to every data table. REVIEWED and REV _REQD are 
logical values required by organizations peacere md 
multilevel security. VIRTUAL is a logical field indicating 
if the table is composed from other ISS tables, and if true, 
then CONDITION is’ the deScription of the operations to be 
performed to realize the table. GLOBAL is a etext string 
which may be used to contain print formatting or display 
mode data, or other parameters useful to the system. OWNER 
is Simply the originator's userid and DESCRIPTION is a short 
textual description of a table. 

The Directory Table may be used by a user to view the 
directory for any one application, and by the system to 
determine the structure of any retrieved table. Tie ee 
field is always a key field for any table. The freedom to 
select a key exists only for database data tables’ since 


other table structures are fixed. The setting up of access 
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Structures (indices) uSing these keys is of no consequence 
to the conceptual model and is not considered in this 
fMiesis. During display of a directory, information in the 
security fields may be used to filter the rows displayed so 
that users without sufficient clearances or need to know 
will not discover the existence of tables that exceed their 
clearances. During any call for a particular table the 
directory may be consulted to confirm eligibility for access 
to that table on the grounds of security clearances, 
compartments and caveats. 

The GLOBALS field may contain application specific 
information in textual form, for example a text application 
may place in the GLOBALS field of the directory entry a page 
memgem fOr printed format by including .pl 60, or a page 
Meader .ph "NPS THESIS". A Mail Directory Table row can 
contain the name of the Mail Data Table owner, and possibly 
Meaqirection instructions. In a spreadsheet GEeector v 
GLOBALS may be used to store recalculation order. In any 
case, the information in GLOBALS is related specifically to 
the application So each application must have the 
intelligence to retrieve and use the tokens in this field. 

Creation of new data tables is done conceptually by a 
user creating a new tuple in the directory or conversely 
Creation of a table causes a new entry to be placed in the 
directory. ZHonOouUld woe noticed that with the full 


multilevel security controls this would imply the creation 
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also of 


sections. 


the two access tables to be described in following 


Figure 3.7 shows an example of a single row of 


text directory. 
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Figure 3.6 - Schema Table Rows For Directory Tables 
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ID TABLE NAME COLUMNS KEYS 0 CLEAR OCC 


| | RECIPEL Dee xml L NE ID [— UNCLAS | 


SpE i i = |i ae EEE 0 (SER EE ee 


M CLEAR M CC ACT_NAME PCT_NAME REVIEWED REV_REQD 


UNCLAS | | ACT1234 [| PCT1234 | TRUE | FALSE 
| 


VIRTUAL CONDITION ~GLOBALS OWNER DESCRIPTION 


FALSE | PeoleoO;cim 10° | 2516P | Brownies | 


Figure 3.7 - Example Row From Text Directory 


BeeeACCESS CONTROL TABLES 

Access Control Tables (ACTS) are tables which are 
associated with every data table in the ISS. To satisfy 
multilevel security and need to know arrangements the ACT 
mired Particular data table lists explicitly the userids 
allowed to access that table and what rights each userid has 
concerning the table overall, the tuple level or on 
individual columns. The ACT schema is illustrated in figure 
3.8, however the number of columns depend on the number of 
columns in the data table being described since the 
attribute column is repeated for each column in the data 
table. ACTs related to data tables within one of the text 
processing, form generation, spreadsheet or electronic mail 


applications will all be the same structure since data 
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tables in one of these applications have a fixed number of 


columns. 


Pow UGERID | MODIFY.,{ TUPLE | ATTRIBUTE| ATTRIBUTE] ~ 


en ee ee ad 


Figure 3.8 = Access Control Tables Schema 


Figure 3.9 describes the structure of the three as_ yet 
undescribed attributes of the Access Control Tables. The 
USERID has already been described in figure 2.5 except that 
the TABLE field for that row should now read "ISSUSERS ,- 
ACT". MODIFY 18 a boolean indicating a users right’ to 
modify the schema of the data table. This will generally be 
NO (or FALSE) since the Structure of all application tables 
except Database Tables are essentially predetermined by the 
design of the system. TUPLE is a field containing up_ to 
four characters indicating the users right to insert(I), 
delete(D), read(R) or update(U) £entire tuples. The 
repeating ATTRIBUTE columns may contain read(R) or update(U) 
controlling each individual's access to the columns of a 
data table in the order they are listed in the directory. 
Explicit TUPLE column values of read or update imply the 
value for all the individual attribute columns, for example 
a TUPLE update(U) implies a U in all of the attribute 


columns whether they are there or not. 
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ID NAME LLee WIDTH SYNONYM TABLE 


| [MODIFY [BOOLEAN| 1 | fh eACT 
| | | | | | 

| [TUPLE | CHAR | 4 | | -ACT 

| | | | | | 

| [ATTRIBUTE | CHAR | 2 | -ACT 

| | | | 


| Ge eee Ee eee OS SS Se See ee eee 


Figure 3.9 - Schema Table Rows For Access Control Tables 


The ACTs are useful to enforce views and multilevel 
security requirements. When accessing a data table via the 
directory the number of columns in the data table and the 
name of the associated ACT is retrieved. The system must 
then enforce the restrictions imposed by the ACT during use 
of the data table. If an all users (USERID = -ALL) entry is 
not in an ACT then users who are not explicitly on the list 
have no access to the data table with which the ACT is 
associated. An individual's rights to a file may be 
determined by more than one entry, for example -ALL may read 
tuples and 25l6p in addition may update tuples. This 
mechanism may be used in multilevel security to enforce need 
to know lists of personnel for particular documents. 

Although every table must have an associated ACT, there 
ls no requirement for the ACT to be unique provided it 
describes the correct number of columns. For example within 
an application one ACT may be used as a simple owner write, 


all others read access table with which many data tables are 
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associated. Figure 3.10 is an example of such a table which 
may fe named in the ACT PTR field of any text directory or 
forms directory row because it has two attribute columns. 


(One for ID and the other for TEXT-BODY or FORM-BODY.) 


ED USERID MODIFY POPE ATTRIBUTE ATTRIBUTE 


Aes ~S a etp, =—— “i Smee ee eee 


eC EE (NESSES eee, ee ee 


Figure 3.10 - Example Access Control Table 


In this example the hyphen is used to designate special 
userids such as owner and all users. 

Discussion and implementation strategies for these 
tables are left for further research into multilevel 


security issues in an ISS. 


E. PARAGRAPH CLASSIFICATION TABLES 

The Paragraph Classification Tables (PCTs) are designed 
to provide access control, primarily of Text Data Tables, at 
@ paragraph level as required for multilevel security 
systems. Since a paragraph may consist of a single line of 
a Text Table, essentially line by line control is required. 

Figure 3.11 illustrates the schema of PCTs and figure 
3.12 shows the Schema Table row for the OFFSET column which 


has not already been described in prior sections. 
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[NED e|SORECETOINGEEARANCE | cc | 
| | | | | 


Bigune 3s) 9 Paragraph Classification Tables Schema 


Each data table is covered by a PCT although as in the 
case of the Access Control Tables, the PCTs need not be 
unique and may cover many data tables. The Paragraph 
Memerol Table for each data table is named in the PCT NAME 


Field of the directory entry for that data table. 


ID NAME Wyeae ls WIDTH SYNONYM TABLE 


| VORB O&. |INTEGER| 6 | ID | =-PCT 


The table is ordered on the offset field, the first and 
possibly Ga ivemeuple “being “for an OFFSET of il. The 
clearances, compartments and caveats for an offset represent 
those placed on the material in the associated data table at 
that ID and greater, until superceded by another row in the 
Eel, Dur ing retrieval and manipulation of tables, 
particularly text, only those rows of data which are less 
than or equal to the userid's security levels and clearances 


will be made accessible. 
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Figure 3.13 shows an example of a simple paragraph 
control table which will be pointed to by name from the 


directory entry for some text data table. 


ID OFFSET CLEARANCE EC 
pee * 1| UNCLAS | | 
| 05 | SECRER | 
a 36 | UNCER |" 


Figure 3.13 - Example Paragraph Classification Table 


The PCT indicates that IDs (lines of a text document) 1 to 9 
are unclassified, 10to 19 are secret and 20 up to the end 
of the document are unclassified. Should the user be 
allowed access to the data table by the controls described 
in the previous Section, and have a clearance less’ than 
secret, then he will be able to see the unclassified part of 
the document. As with the Access Control Tables, strategies 
for implementation and use of these PCTs to implement 


multilevel security measures are left for further research. 


Mec APABILITIES LIST TABLES 

Capabilities List Tables (CLTs) are a multilevel 
security table associated with each userid in the ISS system 
and will be named in each row of the ISSUSERS table in the 


CAPABILITIES column. (See figures 3.3 and 3.4). 
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| ID | TABLE NAME | TYPE | OWNER | 
| | | | | | 


Figure 3.14 - Capabilities List Tables Schema 


The information they contain is sufficient to determine 
immediately all tables to which any userid has access. It 
has been suggested [Ref. 2] that such tables are required in 
a multilevel security environment, although the data they 
contain is mostly redundant. One Solution may be to provide 
the tables as virtual tables. Figure 3.14 illustrates the 
schema of the CLTs. All the Schema Table rows’ for. the 
columns have already been described in previous figures 
except that now "-CLT" (all CLs) must be added to the TABLE 
and OWNER columns in the appropriate Schema Table rows to 
indicate that the columns are also uSed in the Capability 
List Tables. 

For each userid in the ISSUSERS table the CLT is simply 
a list of tables to which he has access before security 
levels, compartments and caveats are token into account. 
The CLTs are a redundant method of enforcing need to know 
restrictions and requires positive input from the creator of 
any data table. Mechanisms to maintain this list would 


Clearly be linked with the multilevel security precautions 
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used to create and manipulate tables and are not discussed 


further in this thesis. 


fe TEXT DATA TABLES 

The Text Data Tables contain non-formatted textual 
information. This data can be used in a myriad of 
applications, including preparation of documents or computer 
programs, or as textual information to be combined with 
other application data tables by one or several of the ISS 
operators. 

The Text Data Table contains two attributes, ID and 
TEXT LINE. The schema for the ID has’ been previously 
defined. The schema for the TEXT LINE attribute is shown in 
meaure 3.15. The "0" in the width field indicate that the 
TEXT LINE is of varying length. As depicted, the FORM_LINE 
attribute of the form data table and the BODY attribute of 
the MAIL data table can be aliased with the TEXT _ LINE 


attribute of the Text Data Table. 


ID NAME TYPE WLDTH SYNONYM TABLE 
ie | TXT LINE | CHAR 0 | FORM LINE [-TEXT 
| { | | | BODY | 
| 


Figure 3.15 - Schema Table Row For Text Data Table 
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Figure 3.16 shows an example of a Text Data Table. Each 
Mewes is Uniquely defined by the ID and TEXT_LINE. Any type 
meeormation can be entered into the TEXT_LINE and all the 
data, (except the ID), will appear on output in the same 
format in which it was input. (Note that row 3 of figure 
3.16 contains the characters ".sk 2". This character 
sequence will have no special meaning in the context of the 
ISS Text application. However, the Text Data Tables can be 
used as input to a TEXT FORMATTER APPLICATION SYSTEM in 
which these characters or similar syntax would have a 
special meaning, i.e. "Skip two lines"). The key issue is 
that all user input into the Text Data Table will be 


operated on by ISS as pure textual or literal information. 


ID TEXT LINE 


1 [| What can be better 


than. this? 


| | 
|| | 
eee, | | 
=e, | 
(wont nesk 2 
| | 
| 4 | DON'T ASK! | 
| | | 


Figure 3.16 - Text Data Table 


H. FORM DATA TABLES 
The Form Data Tables are a special set of Text Data 


Tables. They act as a window or view into the ISS Database 


48 





Data Tables and may be used for entry or extraction of data. 
Forms are generally used for repetitive tasks such as the 
Mepanting of letters with the same or Similar bodies but many 
different addressees. Forms may also be used in order to 
create facsimiles of documents on a CRT screen. These 
facsimiles of documents can then be used to facilitate easy 
insertion or retrieval into or from a Database Table. Heote 
will serve as a valuable tool for users who are more 
familiar with the order and relationships among entries. on 
the documents than with the standard tabular formats used to 
manipulate data in a database system. 

Form Data Tables are comprised of only two attributes, 
ID and FORM _LINE. Figure 3.17 illustrates the schema for 
the FORM LINE. As shown, the TEXT LINE attribute of a Text 
Data Table and the BODY attribute of a Mail Data Table can 
be used synonymously with the FORM LINE attribute of the 


Form Data Table. 


iD NAME TYPE WIDTH SYNONYM TABLE 
| | FORM LINE | CHAR | 0 | TEXT LINE |-FORM 
| | | | | BODY | 


Figure 3.17 - Schema Table Row For Form Data Table 


Figures 3.18 and 3.19 show examples of two distinct 


types of Form Data Tables available in the ISS. In order to 
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use either Form, the uSer must Specify the name of the 
underlying Database Data Table to be invoked, hereafter 
referred to as the default Database Data Table. 

Figure 3.18 is the simplest type form and can be used 
both for extraction and insertion of data from or into a 
single Database Data Table (file). In this type of form the 
user may operate in two distinct modes: insert or retrieve. 
In the retrieve mode a form with pre=<printed attributes will 
appear on the screen with the attribute designated as a key 
enclosed in braces, { }. The user has a choice of 
specifying a unique tuple to be retrieved or requesting all 
tuples. If he wants a unique tuple he will enter the value 
of the key attribute for the desired tuple. The remaining 
data will be retrieved from the database by the system and 
inserted into the form. In figure 3.18, the user, operating 
in the retrieval mode, entered the value "123-45-6789" for. 
the SS# database attribute. The ISS then used this value 
to search through the database for the matching tuple. and 
automatically filled in the rest of the form with the 
attributes desired from this tuple. 

The user may request that all tuples of the database be 
retrieved by using the KEY word , -ALL. If -ALL is used and 
there are 100 tuples in the associated Database Table, then 
there will be 100 forms filled in by the system. In all 
cases, the FORM _LINE(s) will appear in printed output 


exactly as it appears on the screen, including the data 
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table attribute prompts. The ID of the FORM LINE will not 
be printed. 

In the insert mode the user will supply all the data as 
indicated by the blanks following each attribute. This data 
will be reformatted by the system to conform to the format 


of the underlying database data table before it is inserted 


mace that table. 


ID FORM_LINE 


| | 
-_| | 
| 2] ADDRESS: 2 Lane Rd , Somewhere, Fl. 22222 | 
[= | 
ey | 
| gg | FR Ge GE G* G* GO G*® GO G* GO G8 G* GS GF SF SF EFSF SFEFE EVES SF G* GF GF TG" GT GT Fe G= G= G= G* G* GF GFX G2 OF SE = | 
| 4] OCCUPATION: Systems Analyst | 

| 

| 


Figure 3.18 - Form Data Table 


Forms of the type illustrated in figure 3.19 can be used 
only to retrieve data from one or several underlying 
Database Data Tables (files). In order to maintain database 
consistency, no insertions into the database are possible 
with these forms. Within forms of this type, text appearing 


in the form line can be of four distinct types. These types 
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are distinguished by the presence or absence of special 
symbols within the text entered on the FORM LINE. 

If there are no special symbols within the text entered 
mea «6c FORM LINE, then the text 1s considered to be literal 
and will be printed or displayed as is. Text entered 
between two braces, { }, represent a key attribute of a 
given database data file and will determine which tuple(s) 
of the Database Data Table will be referenced for other 
retrievable data on the form. The format for this data type 
is {table.attribute} or f{attribute}. The latter format 
implies that the default table is to be used. The user can 
enter a unique value to retrieve a Single tuple, or he can 
enter the system variable ~ALL to retrieve all tuples in the 
database. 

The special characters, {}, can appear on a Single’ form 
more than once but there must be a one to one correspondence 
between them and the distinct Database Data Tables 
referenced on the form. For example, row 1 of figure 3.19 
contains the text, {NAME}. In this case, the usSer can enter 
a unique name to retrieve a single tuple from the default 
Database Data Table, or he can use a variable to retrieve 
all tuples from the default database. In row 7, the 
{B.SWEEPSTAKE} entry indicates that the SWEEPSTAKE attribute 
of database data table B will be retrieved. Table B must be 


@mstinct from the default table. 


2 

















ID FORM LINE 
| 1] {NAME | 
|| | 
| 2] [ADDRESS] | 
| | 
Pec imeer ry |, ISTATE ZeZ 1 P | | 
hee | 
| 4] | 
=. | 
| 5| Dear [TITLE] [NAME], | 
| | 
| 6| You may already be a winner in our 
= | 
| 7| {B.SWEEPSTAKE}. In order to find out | 
| | 
| 8] 1£ you are an instant winner read the | 
| | 
| 9| following information. | 
| | 
[10|/Sweepnum. Txt/ | 
joe | | 
EL | 
|| | 
pr 2 | If you are a winner, you may collect | 
|| | 
[13] your prize from [B.ADDRESS], [B.CITY], | 
| | 
[14| (B.STATE]. | 
| | 
[15] | 
i! | 
[16 | Sincerely, | 
| | 
ee? | | 
| | 
[18 | Sweepstakes Deluxe | 
|| | 


Figure 3.19 - Form Data Table 


Text entered between brackets, [{ J], represent variables 
Or attributes to be retrieved by the system from the 


corresponding tuples(s) retrieved previously using a "“"{ }" 
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FORM LINE entry. The format for this type data is 
[table.attribute], or [attribute]. This type data is 
illustrated in rows 2,3,5,13 and 14 of figure 3.19. Note 
that in row 1 the FORM LINE contains {NAME}, and in row 5 
the FORM_LINE contains [NAME]. Even though they refer to 
the same field in the same database table, they differ in 
that {NAME} is used both to extract the value of the 
attribute, NAME, and to determine which tuple will be made 
the current tuple in the default Database Data Table. 
[NAME] simply extracts the value of the NAME attribute from 
the current tuple. In other words, {NAME} means "retrieve a 
new tuple with the given value in the name attribute, 
substitute its value for the symbols, ~{NAME}', and retain 
the tuple as the current tuple of the default database data 
table”. [NAME] Simply means "substitute the symbols 
~(NAME]' with the value of the NAME attribute of the current 
tuple in the default Database Data Table. 

Another type of text which can be entered on the 
FORM LINE is data entered between slashes, / /. This 
indicates that the text to be displayed or inserted will be 
obtained from an existing Text Data Table (file). Its 
format is /filename.TXT/. Mienovwemele Of “figure 3.19, 
/Sweepnum.TXT/ indicates that the text file named Sweepnum 
will be inserted into the form upon view or print requests. 


Forms of the types depicted in figures 3.18 and 3.19 are 
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basic, but powerful enough to facilitate most applications 


involving Form Generation. 


I. DATABASE DATA TABLES 
| The Database Data Tables differ slightly from the Text, 
Mail,Spreadsheet, and Form Data Tables in as much as there 
is no predetermined set of attributes that will apply to all 
Database Data Tables. More precisely, each Database Data 
Table may have one or several attributes in common with any 
other Database Data Table in the ISS. The attribute which 
must appear in all Database Data Tables is, of course, the 
ied. The ID is a key in each Database Data Table. Thus, as 
a user creates a new database data table the only schema for 
this table to appear initially in the Schema Table will be 
that of the ID. As the user defines the schema of each 
additional attribute of the given Database Data Table, these 
schemas will be automatically added to the Data Dictionary. 
The Database Data Tables are used to form relationships 
among entity sets. Each Database Data Table defines one 
entity as described by the attributes or columns, and each 
row of the table represents a unique occurrence of the 
entity. Figure 3.20 illustrates a Database Data Table with 


an ID attribute and N user-defined attributes. 
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ID ATTR-1 ATTR=2 ATTR=n 


1 


| | 
| | 
| sea 

| | 
| 


=< —— 


~~! ~~; 


Figure 3.20 - Database Data Table 


| 
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a 
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J. MAIL DATA TABLES 

mectronic Mail is a utility which facilitates the 
exchange of textual messages among system users. In the 
ISS, each user with mail access rights will have a mail file 
which is essentially a set of messages, i.e. a Mail Data 
Table or set of Mail Data Tables. 

The Schema Table rows for the Mail Data Table are shown 
in Figure 3.21. As indicated in the figure, the TEXT LINE 
Meeerpute of the Text Data Table and the FORM LINE attribute 
of the Form Data Table can be used synonymously with the 
BODY attribute of the Mail Data Table. 

The columns or attributes of the Mail Data Table are 
predefined by the ISS, and as shown in figure 3.22, they 
contain sufficient information to facilitate proper routing 
of mail to the ISS user. Each row of the Mail Data Table 


represent a complete message. 


0 





ID NAME ee WIDTH S YNON YM TAB LE 


FORM LINE| 


| | VIEWED | BOOL | 1 | NONE [ -MAIL 

| a ~MAIL 

I RE —-MAIL 
| —copy ro |Cage | | |e 
| | ERR OE -MAIL 
ae eee MAIL 
|| SY | SE on |S 
| | 

| | 


Figure 3.21 - Schema Table Rows For Mail Data Table 


The VIEWED attribute requires a "yes" or "“no" data entry 
which is used to indicate whether the corresponding message 
has been read by the owner of the Mail Data Table. The 
PROM, TO, COPY TO, DATE and SUBJECT attributes comprise the 
header of the message. The DATE attribute will provide the 
month and day that the message was originally created. The 
remainder of the header attributes are self-explanatory. 
All of the message header attributes can be used to form 
queries which operate on messages satisfying a given 
condition as posed in the queries. 

The last attribute, BODY, contains the text of the 
message. This column will contain as much of the first line 


of the message as will fit one one line of varying length. 
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The remainder of the text will be retrievable by the user 
via a unique mail display operator. 

If the length of the BODY column is insufficient’ to 
contain the full message, this fact will be denoted by the 
special characters, "=>", appearing at the end of the column 
as Shown in row 1 of figure 3.22. Users also have the 
option of typing in the name and type of an ISS data file in 
the BODY column of the message. In this case, the name and 
type of file, preceded by a hyphen, "=", and followed by 
"=>" , will be displayed in the BODY attribute but no 


portion of the text of the file will be displayed. This 


type message is illustrated in row 3 of figure 3.22. 


ID VIEWED FROM TO COPY TO DATE SUBJECT BODY 
=) | NO ISPOT | SAM | PETE | 10-4] GAME | The Game => 
| | | 
| 2] YES {DAN | MARY | LEE {1 10-3] DINNER]! Cancel! 
| | | | | | | | 
ms | YES | IKE | TINA | {| 10-1] DANCE | -Details.TxXT 
| | | | | | | => 
= | | | | | 


Figure 3.22 - Mail Data Table 


In some cases the message body will be terse enough to 
fit, in its entirety, within the allotted columns. In these 
cases, the user will not be required to request further 


display modes for the message, and the special characters, 


Sys) 





"=>", will not appear on the display. This type message is 


[erustrated in row 2 of figure 3.22. 


K. SPREADSHEET DATA TABLES 

An electronic spreadsheet, in its traditional form, is a 
rectangular grid of rows and columns, like a large sheet of 
graph paper, to facilitate organizing numeric data for easy 
calculation and comprehension. It allows the user to model 
a variety of numerical problems in a standard tabular format 
on a computer with the elimination of the drudgery and 
errors encountered by using the scratchpad and pencil or 
hand calculator. 

The schema for the columns of the Spreadsheet Data Table 
ts illustrated in figure 3.23. 

In the Electronic Spreadsheet application, the format of 
the Spreadsheet Data Table is quite different from the 
standard user's view of the spreadsheet. An example of a 
Spreadsheet Data Table 1S shown in figure 3.24. Figure 3.25 
illustrates an example of the standard user's view of the 
Spreadsheet Data Table shown in figure 3.24. 

In the view, the intersection of each row and column 
form one addressable unit or entry position. Each entry 
position in the view is described or defined by a row in the 
Spreadsheet Data Table. Although the ID is a key attribute 
in the Spreadsheet Data Table, it does not have any bearing 


on the order in which the entry positions in the view must 
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appear in the data table. Blank entry positions in the view 
do not have to be described in the Spreadsheet Data Table. 
The X COL attribute and Y_ROW attribute are the column 
position and row position, respectively, of a unique entry 
in the view. The FORMAT attribute of the Spreadsheet Data 
Table contains the information which determines how the data 
of the entry will be displayed in the view. This field 
contains three distinct types of data: the entry data type, 
right or left justification modes, and width of the column 
displayed in the view. 

The entry data type can be of four types: character, 
integer,float, or monetary (dollars and cents number). To 
avoid inconsistencies and maintain readability, the user is 
prevented from declaring different column widths for entries 
in the same column. For example, if entry position A-l has 
a column width of 15, then entry poSitions A-2 through A-=n 
must have a column width of 15, even though the data types 
of the entry positions may differ. 

The VALUE attribute contains either the display value of 
the FUNCTION attribute of the literal string for character 
entry types. The FUNCTION attribute contains either’ the 
Keyword NONE, or an expresSion. The keyword, NONE, is used 
to indicate that the associated value of the entry iS non- 
numerical and will be displayed in the view exactly as it 
was entered. The expressions in the FUNCTION column may be 


a numerical constant or an arithmetic expression. Numerical 
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constants are entered by the uSer and appear in the view in 
the formats indicated by their data type. The arithmetic 
expressions may be composed of system built-in functions and 
operators or user defined functions or combinations of both. 
The operands of these expressions will be numerical 
constants and/or the values of other entry positions. Once 
defined, these arithmetic operations” are automatically 
evaluated by the ISS and their values are displayed in the 
view. For instance, in the example, the data table entry 
M@epoSition B=-5 is the function "B:2 * B:3". When the 
Spreadsheet Data Table is converted into the user's view, 
the value of the function is displayed, (in this case 
12.00). If the user decides to change the value of ‘no. 
ordered! (B-2), or the value of ‘cost per item'(B=-3), then 
the system will automatically reevaluate the function = and 
immediately display the updated entry in the spreadsheet 
view in entry position B=5. 

The user has the option of updating the Spreadsheet Data 
Table directly or indirectly via the view. However, if the 
view is chosen as a means for updating the data table, then 
only the non-calculatable entries can be updated, (i.e., 


literal strings and constant expressions). 
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1B, NAME ieee WIDTH SYNONYM TABLE 


| a ncor [ CHAR | 1 | NONE [ =SPREAD | 
| | Ro || ro | SI 
TENE oe Se 
| FNCETONT | CHRR— | —0— | “NONE | SREB 


Figure 3.23 = Schema Table Rows For Spreadsheet Data Table 
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Figure 3.24 - Spread Sheet Data Table 
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A B c 


| Ka Re RK STOCK] | STOCK2 | 
| | | | 
\NO. ORDERED | oy | 
| | | | 
|COST/ITEM | 4.00] | 
| | | | 
| | | 

| | | 

TOTAL COST | 2 320:01] | 
| | | 

| | | 

| | | 


Figure 3.25 = Spread Sheet View 
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IV. CONCEPTUAL INTEGRATION 


Conceptual integration of the five application areas of 
the Integrated Software System (ISS) has two components from 
a users perspective: a Single data object, and a set of 
common operators which operate on those objects uniformly 
regardless of the application. In addition to these common 
operators, each application must provide its own set of 
specific operations to achieve its special functions. The 
single data object, the table, has already been discussed in 
ehapter IIIf. 

The functional intersection of the five application 
areas is best understood by describing the common operations 
(aS opposed to the actual primitive operators) which are 
required to manipulate data in and extract data froma file. 
The common operations can be performed by the eight basic 
Iss primitive operators, Six of which are based. on 
Relational Algebra, one which iS an aggregate type of 
operator (sort) and one which is a proposed extension to 
Relational Algebra similar to Stonebraker's concatenation 
operator [Ref. 5} described in section B of chapter II. The 
primitive ISS operators are described in section A of this 
Chapter and their mapping to the common operations and the 
utility of those operations are described in section B. The 


descriptions given are of the conceptual view of the 
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operators and operations, not eheir form which is 
implementation detail. That is, their semantics and not 
syntax are described. 

A special table called "“ISSBLANK" is defined to. be 
simply a blank row of any table. This convenience allows 
the primitive ISS operators to be set theoretic with both 
operands and results being tables. 

The ID values in a table are the conceptual ordering of 
the relation in the system and thus is maintained by the 
system during operations on the _ table. RowS are always 
numbered contiguously from 1 to the number of tuples in the 
relation. Manipulation of the ID values in a table by a 
user is prohibited except indirectly via the re-ordering 
(and re-numbering) required after insertion and deletion of 
rows. New tables created by operators have their ordering 
imposed by the system as they are formed. 

The effect of a primitive operator may be a change to 
the operand or the creation of a new table or both, 
depending on the operator. A created table can be assigned 
to (overwrite) an existing data table, which may or may not 
be one of the operands, or anew table. This is analogous 
to a conventional programming language variable assignment x 


fFunct(x) except that the variables are now entire tables 


and the function is an ISS _ primitive operator. When 
Operators are nested they operate on the result of the 


previous operator. 
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In addition to the primitive ISS operators provided for 
data manipulation in tables, there is ae set of binary 
operators which operate between data tables. These complex 
ISS operators, which contribute to the extensibility of the 
ISS, may be applied to tables within and sometimes across 
application areas. In many cases, in view of the meaning of 
the structure of non-database data tables, inter-type 
operations are conceptually meaningless. The semantics of 
these operators and their utility is discussed in part C of 


this chapter. 


A. PRIMITIVE ISS OPERATORS 

The eight primitive ISS operators insert, modify, 
delete, project, select, union, Sort and concatenate perform 
their functions on tables of any type in any application. 
Some operators are designed to change a target table while 
others are designed to extract data from a table creating, a 
new table for further use. The delete operator does both. 

The modify, delete and select operators require a 
condition which specifies which rows of the data table are 
to be manipulated by the operator. The operands of the 
condition are literal or numeric constants, arithmetic 
expressions, column names or column values in the _ table. 
The literal expressions may contain pattern matching 
characters. The operators of the condition statement are 


the logical and arithmetic operators and the arithmetic 
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relational operators, (less than, equal to, etc.). One 
common condition used will be specification of a line number 
or range by using the ID values. For example: ( ID=10 ), or 
for a range: ( ID>=3 AND ID<=14 ). 

It should be mentioned here that although efficiency 
is not a concern of the ISS at this time, operations which 
cause rearrangement or re-ordering of rows in large data 
tables such aS large Database Tables, will be very 
inefficient due to the amount of processing required. 

l. Insert 

Insert changes a target table by inserting into it a 
table at a specified location (ID value). If no ID is 
specified the new tuples are appended onto the end of the 
target table. The columns of the table to be inserted must 
contain a subset of those in the target table. The matching 
of column names is’ satisfied by equivalent attributes or 
Synonymous attributes as defined in the Schema Table. 
Matching of synonyms is in the leftmost sense from both 
tables. Unmatched attributes in the target table have _ no 
value placed in the new rows. Effectively, the insert does 
row and column insertion, adding the extra structure 
required to the rows from the inserted table. The result of 
an insertion is the changed target table. Since the 
location of the insertion may be internal to the target 
table, the ID values of the inserted tuples and _ the 


subsequent tuples must be re-numbered by the system. 
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Z- Modify 
Modify alters a target table by changing the value 


of specified columns in a row or set of rows to new values. 
The rows to be modified are determined by their satisfying a 
Given condition, all tuples which match the condition have 
their values changed. The new value to be placed in the 
rows chosen, may be specified absolutely or as a function of 
the old value to be replaced. 

One special use of this operator is modification of 
the material under the curser during browsing or editing of 
a relation. The curser in this case has two values 
associated with it, the ID number (row) and the column name 
in which it displayed. Direct editing of the displayed old 
value becomes the new value. This use of the modify 
operator will be managed by the browsing software. 

3. Delete 

The delete operator changes a target table by 
deleting all rows which satisfy a given condition. In 
addition the delete operator creates a temporary new table 
of the same type and structure as the target table 
containing the deleted rows. It is the table of deleted 
rowS which is’ returned by the operator when considering 
combined operators. The system manages the re-numbering of 


the ID column values in the target and deleted rows tables. 
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ae Pinomicct 


The project operator creates a new table consisting 
of just those columns of the old table which are specified, 
in the order that they are specified. The appropriate 
columns of each row are copied into the new table, with rows 
in the same order as the original table. It is important to 
note that the resulting table will not in general be a table 
of the same type as the original table unless both are 
Database Tables, which have no specific schema like Text, 
Form, Mail and Spreadsheet Tables. (See chapter III for 
fixed schemas of non-database data tables.) In all cases 
except the trivial case of projection of an entire table, 
the projection will have a different structure than the 
operand table. For example the projection of a Mail Table 
on the FROM and TO columns gives a result with three columns 
(ID, FROM, TO) which is not a Mail Table or any of the other 
fixed schema type of tables. 

9. select 

The select operator creates anew data table from 
all rows of a given table which satisfy a given condition. 
Its effect is similar to delete except the selected rows are 
not deleted from the given data table. The result of the 
operator is a newly created table of the same type and 
structure as the operand, however no assumption can be made 


about the order of the rows in the new table. 
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6. Union 
Union operateS on two data tables of possibly 
dissimilar types, producing a resultant data table. The two 
operand tables are of unequal status in that one, for _ the 
sake of a convention, will be referred to as the "left 
operand", dominates the other in that it determines’ the 
Structure of the resultant table. The other, or "right" 
operand table of the union operator must have ae_eset of 
columns (attribute names) which iS a sSuperset of the left 
operand. That is: each column name in the left operand (and 
hence result) must also be found in the right operand, or in 
other terms, an appropriate projection of the right operand 
could produce a table of the same structure as the left 
operand. Synonymous attribute names as defined in the 
Schema Table are considered equivalent for purposes of 
structure comparison. If the right operand table has a set 
of column names which contains two or more aliased names for 
a column in the resultant, the first column from the left 
will be used for the union. 

The operator makes an identical copy of the dominant 
table and then appends to it copies of the shortened tuples 
from the other table. Like set union, a row from the second 
table is not added to the resultant if an equivalent row has 
already been included from the first table. If the 


resultant table is assigned to the left operand then union 
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may be viewed as an appending of one tables rows onto the 
other. 

Union is different from the insert operator in three 
major ways: it truncates the longer table, it performs 
theoretic set union on the set of rows and it always adds 
the right table to the end of the left table. 

7. Sort 

Sorting of a data table on the values in a specified 
set of columns creates a new table of the same type and 
Structure with the values in the ID field corresponding to 
the increasing or decreaSing sorted order of those columns. 
That is, the display order and conceptual ordering of the 
rows in the table are now sorted in increasing size of the 
values in the columns on which the sort was performed. If a 
column is numeric then the sort is done in numerical order, 
otherwise it is done in lexicographic order. 

In some sense the sort operator iS an aggregate 
operator in that ie is the result of many basic 
manipulations which could be performed with much difficulty 
by uSing the other primitives. 

8. Concatenate 

Concatenation iS a specialized operator which 
Creates anew text data table from any other type of table. 
The values of specified columns of the operand table are 


concatenated into one single resultant table, row by row 
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with each field in a row separated from the next by a space. 


The system provides the ID field for the resulting table. 


B. REALIZATION OF ISS OPERATIONS 

The desired operations on logical files of the ISS, 
regardless of the application, are easily realizable in 
terms of the primitive operators described above. These 
operations are applicable to all types of data tables (Text, 
Form, Mail, Spreadsheet and Database) and system tables 
(SCHEMA, ISSLOGIN, Directories and security tables), 
although the concern of this thesis is primarily the 
manipulation of data tables. The utility of the operations 
depend on the application, for example the moving of rows 
within a table is absolutely essential for Text Data Tables, 
but hardly useful at all ina database application. This 
BPeetion describes the operations in terms of the ISS 
Primitives and their utility in the text, form, mail and 
spreadsheet applications. The usefulness Ot these 
operations on Database Data Tables depend entirely on_ the 
meaning of the tables. Since form creation is a text 
processing application, descriptions of operations on Text 
Data Tables always apply to Form Data Tables as well. The 
desired content of a form can be as general as any document. 

It must be understood that when discussing the 
Spreadsheet applications, there are two dimensions to the 


descriptions given. The underlying data table drives’ the 


We 





presentation of the traditional view and rows and columns of 
the data table do not correspond to rows and columns of the 
view. Each row of the data table corresponds to a filled in 
position in the conventional view. The absence of a row in 
the data table for an entry position in the view simply 
means it is displayed as a blank. An empty Spreadsheet Data 
Table corresponds to a blank spreadsheet view. In the 
following descriptions an attempt is made to draw the 
distinction between view and data table manipulations. 

It is not the intent to repeat the full analysis of 
possible uses for operators carried out by Nishimura [Ref. 
|. 

1. Removal Of Rows 

Removal of rowS in a data table is performed by 
applying the ISS delete operator with a condition to select 
the tuples for deletion. Ina text or form application, 
removal of row corresponds to the deletion of a line of 
text, which may be selected by ID number or a pattern match 
on the lines of text. The table of deleted rows created by 
the delete operator can be usSed as part of an undo facility 
in the editing environment of the text application area. 

Removal of a Mail Data Table row or set of rows, is 
Simply the final disposal of a message or set of messages 
from a mail table. The table created by the delete operator 
may be used to transfer mail to another table, such as a 


dead letter box. The semantics of the delete operator 
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requiring a condition to be given, allows for great 
meexibility in defining the set of messages to be deleted. 
Rows of a mail table may be deleted according to date of 
message, range of dates, whether viewed or not, range of 
date and whether viewed, individual names, pattern matches 
on sender and/or addressee names, subject pattern match = and 
many other combinations given the mail table structure 
described in chapter III. 

In the spreadsheet application, removal of a row or 
number of rows ina Spreadsheet Data Table corresponds to 
the blanking or setting to "“empty"™ of an entry in the 
traditional spreadsheet view. Since the delete operator is 
specified with a condition on one or more of the Spreadsheet 
Data Table columns, data table rows and hence view, entries 
may be deleted according to a large range of criteria. This 
is in contrast to current traditional spreadsheets which can 
delete entries or "blank" by row, column or block only. 

Deletion of a row or column in the spreadsheet view 
is somewhat more complex and must be done by a combination 
of operators. To remove a row in the view entries in that 
view row (Y ROW = view row) must be removed (leaving a blank 
row), then all data table tuples for subsequent spreadsheet 
rowS must have their Y ROW values incremented by 1, forcing 
the view mechanism to now display them one row higher. A 


Similar procedure is required for columns in the view. 
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2. Location Of Rows 

The location of a particular row or a set of rows of 
a data table is simply an application of the select operator 
described in subsection B.5 with the desired condition 
imposed. 

This operation may be used to display particular 
rows of a data table or to browse through a file. Browsing 
is affected simply by locating contiguous blocks of ID 
values. This type of browsing would be particularly useful 
for Text and Form Data Tables where the ordering of rows is 
meaningful. Using pattern matehing, conditions this 
operator allows pattern search within a document. 

Similarly, Mail Tables can be browsed using the 
select primitive operator, allowing a freedom to produce 
message Summaries based on any criteria rather than being 
restricted by a predefined message summary. 

In spreadsheet applications, with the traditional 
spreadsheet view in mind and an understanding of the 
underlying model (a Spreadsheet Data Table as described in 
section III), a single entry position can be located by 
Bemecting the row with the condition that xX COL and Y_ROW 
match the desired entry. In addition, the location of a set 
of rows in the data table correspond to search actions 
throughout an entire spreadsheet view. Traditional 
Spreadsheet views generally reference only one entry 


position at a time whereas the ISS primitives select allows 
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an entire set of entry positions to be located in a_e single 
operation. A set of rows (entry positions in the view) may 
be selected according to their poSition, function, value or 
even their format. 

3. Addition Of Rows 

Tuples may be added into data tables by applying the 
ISS primitive operator insert. The BLANK table may be 
inserted in the case of new tuples, or Some other table for 
simple additions of existing tuples from another table. An 
empty row inserted would need subsequent modification to 
fill out its values. Appending rows to a table is simply 
insertion at the last ID. Insertion may also be done by 
using the ISS primitive operator union and assigning the 
result to the required table. The union operator will § add 
rows onto the end of the table, adding only those rows from 
the other table which are not already in the resultant 
table. Generally insert will be more useful in the 
applications which deal with tables where row order is 
important (text and forms), and union in the others. 

The general utility to be able to add rows to a 
table needs hardly be mentioned except to describe the 
effect in each application area. A Text or Form Table 
during creation or modification requires a facility to 
insert new lines at specified locations, followed perhaps by 
modification. During form processing the forms mechanism 


must dynamically insert entire tables where indicated, 
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allowing standard blocks of text to be embedded in each of 
the output documents. Any of the fixed data table types may 
be inserted into a Database Data Table, providing new data 
tuples with some of the column values already entered. 

In the mail application addition of anew row in a 
data table corresponds to creation of a new message. 
Addition via an inSert or union command of a number of rows 
selected from another Mail Table would get copies of some 
other user's messages. Messages left in the care of another 
user's Mail Table could be collected by inserting the result 
of a deletion from that table (delete those for which 
addressee is the eventual owner) into the addressees mail 
table. A union of Mail Tables would create a list of unique 
messages received by a set of users. 

In the spreadsheet application the addition of a ruw 
in the data table corresponds to the filling in of an entry 
position in the view. When uSing union and insert with 
Spreadsheet Data Tables care must be taken to ensure view 
emery positions are not duplicated, that is the xX COL and 
BeeoWw values together form a unique key. One of the 
Spreadsheet application area specific functions will need to 
be the management of view row and column insertion 
procedures, manipulating the Spreadsheet Data Table to 
effect the changes required in the view. Insertion of a 


(blank) row or column in a view is done by modifying the 
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Spreadsheet Data Table and was discussed in the _ previous 
section. 
4. Updating Of Rows 

Updating of data table rows is performed by applying 
the ISS primitive operator modify with the appropriate 
condition to select the desired rows. In this way, data 
table values may be changed to new values explicitly given. 
The insertion of the blank table and subsequent updating 
completes the creation of new rows in a table. 

Updating of text and form lines is clearly uSeful as 
Enis is merely part of the ordinary editing function. 
Global changes can be effected uSing pattern matching in the 
condition for modification. For normal text editing and the 
filling in of blank rows of other data tables, this raw use 
of the ISS primitive modify operator is very clumsy, and a 
control program such aS mentioned in Subsection A.2 to 
browse and edit is required. For example the deletion of a 
sentence spanning a number of lines would require updating 
of the lines where the sentence begins and ends and deletion 
of any intervening lines. 

Updating a Mail Data Table is necessary after the 
BLANK insertion, to fill in the details of the message to be 
composed or to modify a message before "transmission". 
(Adding the message row to the addressees' Mail Tables.) 
Modification of the subject and message body of one Mail 


Data Table row and "retransmission”™ could ensure identical 
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addressees for a sequence of messages, a function normally 
handled by routing lists. Similarly just the date and 
addressees could be altered to send the message to an 
amended distribution list. The mail application specific 
software could provide simple procedures to assist the user 
to effect these transactions. 

Updating of Spreadsheet Data Table rows is essential 
to the use of these tables as a modeling tool. By 
thoughtful use of the conditions in the modify operator, a 
powerful set of changes can be realized. 

Entries in the view are placed in another location 
Syecimply modifying the values of the X COL and Y_ROW in the 
appropriate data table row, to the desired row and column. 
The values in entire rows in the view may be moved to 
another row by modifying Y ROW in the data table tuples. to 
the new value. That is: modify rows in which Y ROW = R, 
Smenging the Y ROW column value to R + 1. When the modified 
data table is mapped to the view all the entries which were 
in row R will be mapped instead to row R+l. 

Formats for display in the view are changed easily 
by updating the format string in the FORMAT column of the 
data table. The format can be changed by single entry, by 
view rows or columns, globally or by any composite condition 
required. 

Although some adjustment is required to think of a 


Spreadsheet in terms of the ISS Spreadsheet Data Table, once 
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accomplished, both the traditional view and the underlying 
conceptual model can be used to great advantage. For the 
model to be useful, of course, the spreadsheet application 
must have application specific operations to force 
spreadsheet related functions such as: recalculation of 
derived values after user changes’ to the entry position 
values; change of entry function expressions involving moved 
operands; increment or decrement X COL and Y ROW values when 
rows in the view have been moved, deleted or inserted; and, 
check on inconsistencies in the model such as duplicated 
X COL,Y ROW pairs. 
5. Copying Of Rows 

The copying of rowS within a data table may be 
performed using the ISS primitive operator select followed 
by insert. Selection of the rows to be copied creates a new 
table which then may be inserted at the desired location. 
Similarly, copies of tuples from one data table may be 
copied into another table of the same Structure by using a 
select followed by a union onto the desired table. 
Generally it would be preferable to use the union operator 
in non-text or non=-form applications where order is 
unimportant and duplicated rows are undesirable. 

The utility of copying lines in a text document or 
form is self evident, as it is a standard document 
Preparation feature. ElveeCeronic: Mare and Spreadsheet 


applications have little use for the simple copying of 


om 





tuples per se, the utility being in the subsequent 
modification of the duplicated row. 

The mail examples given in section 4 could be 
preceded by a copy operation, thus keeping a complete record 
of all messages or versions of a message sent to a group in 
the first example, and keeping a complete record of dispatch 
dates and addresses in the second. 

In the Bercadeheet application the copy operation 
would be useful more in filling out repetitive rows of the 
Spreadsheet Data Table, subsequently modifying the necessary 
Seo Lumns. 

6. Moving Of Rows 

In the ISS, tuples may be moved within a data table 
by applying the delete operator to the desired _ rows 
identified by ID number or other condition, followed by an 
insert of the deleted rows at the required location. It is 
of benefit only in situations where ordering of rows is 
important. 

Im the text processing and form generation 
application areas, this operation is a generalization of the 
normal move operation allowed by editors. By the use of the 
Condition in the delete operator it enables the moving of 
non-contiguous lines selected on content alone without’ the 
need to know line numbers. 

In spreadsheet and mail applications’ the move 


Operation is not particularly useful as the ordering of the 
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rows in these data tables has little meaning. The operation 
may be performed to satisfy personal preference when 
browSing the entire Mail or Spreadsheet Table. 

7. Sorting Of Table Rows 

Sorting is done in the ISS by the operator sort 
which can be thought of as a Series of selective moves to a 
new data table. It is useful to Some extent in all 
cation areas for producing hard copy listings of all 
rows for manual reference and personal preference when 
browSing a file. Although not discussed in this thesis, 
Sorting a file on a key field will have considerable impact 
on retrieval efficiency in an enhanced system which is 
concerned with storage and retrieval efficiency. 

In a text application a textual list could be 
deleted, sorted and re-inserted in its original location. A 
series of numbered one line points in a document or form may 
be sorted after haphazard insertion. 

Mail Data Tables may be sorted according to the 
users desires. For example a ouser may like to regularly 
browse his mail listings sorted by addressee, sender or 
date. The data tables could be sorted and saved in that 
order or into a temporary file for display purposes only. 

Spreadsheet Data Tables may be sorted on X_COL and 
Meeow values to maintain a listing the user is comfortable 


With. It is possible that users may become comfortable 
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dealing with a Spreadsheet Data Table directly, bypassing 
the view during data entry and modification. 
8. Merging Of Columns 

The requirement to merge columns is’ essentially so 
that columns which are conceptually different fields may be 
merged into a Single text field for display, or insertion 
into a document. The operation is performed using the ISS 
es tenate operator. 

Uses of the column merge are to include part or all 
of some data table in a text document or form. For example, 
a letter detailing names and addresses of cuStomers with 
overdue payments in an accounts database may be drafted. 
After drafting, a concatenated projection of the name _ and 
address fields of the rows selected from a customer database 
table on the condition of negative balance, can be inserted 
into the letter. Similar examples can easily be concocted 
for the use of merging columns in Spreadsheet and mail 
Tables. 

9. Operator Combinations 

The basic ISS operators alone and simply combined 
provide low level powerful data manipulation tools to effect 
the operations described above. Combinations of operators 
Can provide much of the high level activities required by a 
user. Many of the examples in the preceding sections are 
ones which use two or more primitive operators to achieve 


their purpose. 
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One mpataiculanhly Important Combination of operators 
is that used to modify the structure of existing Database 
Tables (the others are fixed) without the re-entry of 
existing data. This may be done by first using the ISS 
primitive insert to create a new Database Table by inserting 
a temporary name in the database directory and giving it the 
required columns. The old table may then be inserted into 
or unioned with the new, producing a result which may then 
be assigned to the new table. Insertion would be used when 
the new structure is a superset of the old and union when 
the new structure is a subset of the old, since insert will 
add columns when necessary and union will remove them. The 
old table may then be removed from the directory with delete 
operator, and the new table renamed to the old with the 
modify operator. 

It is the function of the separate application areas 
to provide the mapping between the high level application 
specific operations and the low level primitives. Utilities 
in the kernel should provide convenience tasks such as the 
relations browser previously mentioned and multiple data 
entry routines, to repetitively apply. the primitive 
operators. Utilities should also provide "walkthrough 
paths” for table creation where the system requires more to 
be done than just simple addition of a row to a Directory 
Table. This applies particularly in the database 


application where new attributes may have to be defined in 
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the Schema Table and, where the full security provisions 
alluded to in previous chapters is applied. Although these 
functions are to be performed by the applications specific 
and utility software, the user can still retain the 
underlying Relational Database model in his thinking and 
understanding of the system, and is always able to revert to 
the primitive operators for data manipulation and retrieval 


regardless of the application area in which he is working. 


oF Et SBNSIBILITY OF ISS 

mime main thrust of the ISS is that it provides’ the 
capability to combine tables of different types to form new 
relationships and derive new information. This extension of 
the ISS is made possible by incorporating into the system 
some high level combining operators. Those chosen for this 
System are Union, Set Difference, Intersection, Join, and 
Natural Join. Using these operators on tables of the same 
or different types adds much depth to the scope of the ISS. 

Haphazard use of these operators can result in 4a 
Syntactic error or a new table which is’ semantically 
meaningless. For example, to apply an operator which 
requires two tables of the same structure to tables of 
different structure would be syntactically Imac Onn eC € . 
Moreover, some combinations like a join of a Mail and 
Spreadsheet tables is syntactically correct but has no 


foreseeable meaning given the nature and design of the 
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tables. If an attempt is made to apply an operator 
incorrectly, the ISS sets a trap which prohibits the user 
from performing the operation and displays an error message. 
In the cases where the application of the operator is deemed 
only semantically meaningless, the ISS trap will issue a 
warning to the user. If the warning trap is invoked, then 
the user has the option of aborting the operation or _ to 
continue executing it. 

imethe following parts of this section each of the 
combining operators are defined and described in general 
terms, examples are presented which illustrate how using the 
operators on tables of different types can result in some 
meaningful relationships, and in cases where combining 
tables uSing a given operator is meaningless or incorrect, 
the traps are discussed. 

The semantic meaningfulness of combining tables of the 
Same type uSing any of the. operators discussed in this 
section is considered to be obvious and will not be 
discussed further in this thesis. An elaboration on this 
type table combinations (Intra-Type Combinations) can be 
found in [Ref. l]. 

ie Union 

The union operator has been described previously in 
Subsection 4.B.6 of this thesis. Conventionally, the union 
operator requires tables to be of the same arity. However, 


the ISS union operator embeds a default projection (the 
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mechanics of which have been discussed) which lifts’ the 
mamey constraint Erom tables. It should be noted that the 
user can effectively override the default projection by 
first proceeding a union of two tables by a projection 
Mperator which projects out attributes of the user's choice 
from one or both of the tables. The user could then apply 
the union operator on the resultant table(s). Thus, the 
extension of the meaning of the conventional union operator 
is not a limiting factor, but to the contrary, it enables a 
wide variety of table combinations and new relationships to 
be achieved. 
The following subsections illustrate how performing 
a union on tables of different types can be semantically 
meaningful. In the discussions the "+" is used to denote 
the union operator. 
ae Text + Form 

Pioowiswa Textebate Table which contains™ the 
body of a letter which is to be sent to all employees of a 
given company, and Bis a Form Data Table which extracts 
the name and addresses of all the employees of the company 
and places in letter head format, including a_e salutation, 
then performing the operation, A + B, would be very useful. 
It would yield a repetitive letter addressed to each 
employee and the resultant table would be a Form Table. 

As another example, if Ais a Form Data Table 


mmeh contains intricate tax details for residents of a 


88 





given state and B is a Text Data Table which contains a 
textual description or explanation of some of the columns on 
the tax form, then, B + A would yield a new relationship 
which contained both the explanation of the Form columns and 
the Form itself. The result would be a Text Data Table. 

Since the FORM LINE and TEXT_LINE can be 
aliased, and the two tables are of the same arity, no 
projection would be necessary. This is essentially the same 
as an Append. 

b. Text + Database 

If A is a Text Data Table and contains the names 
of the local stores which sold shoes and B is a Database 
Data Table which contains as one of its attributes the names 
of local stores which sold clothes, then A + B would be 
meaningful. First, ISS would project out all non-applicable 
attributes of table 8B. Then it would yield a Text Data 
Table which contained a list of stores that sold either 
Shoes, clothes, or both. ISS would eliminate any duplicate 
tuples in table B. 

If the order is reversed, i.e., B + A, then this 
is only meaningful if table B contains only two attributes, 
[ID and STORES (or any column name which could be aliased with 
TEXT LINE). Applications for tables of these types are 


numerous. 
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Gam Texte + eMail 

This operation iS semantically meaningful in 
Smy one direction. If Ais a Text Data Table and B is a 
Mail Data Table, then the only feasible union is A + B. 
This would yield a eresultant Text Data Table whose tuple 
contained the TEXT LINEs of table A and the BODY of table B. 

d. Form + Database 

This combination has application similar to 
those described for TEXT + Database since the Form Data 
Table is essentially a "special" Text Data Table. 
Therefore, they both have the same semantic meaningfulness. 

e. Form + Mail 

This operation is sSyntactically correct, and 
thus semantically meaningful, in only one direction. For 
instance, if Ais a Form Table and B is a Mail Table, then A 
fees Syntactically correct. 

One application using this operator would be to 
have the BODY of a message in table B disseminated to all 
employees in letter form. To effect this, one could union 
table A, a form which yields the letter heading for each 
employee, to table B. 

f. Spreadsheet + Database 

This operation is only meaningful ok the 
attributes of the Database table correspond to the schema of 
the Spreadsheet Data Table. Piso, cnen Enis Union is 


meaningful in both directions and virtually would be the 
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same aS unioning two Spreadsheet Tables or two Database 
Tables with identical attributes. It 1S recognized that 
this application would be very rare. 

g.- Database + Mail 

If A is a Database Data Table which contains a 
subset of the attributes of a Mail Data Table then this 
application is semantically meaningful and syntactically 
correct. A + 8B would then result in a Database Data Table 
with tuples from the Mail Data Table (after necessary 
projections) or from the Database Table or both. In the 
other direction, i.e. B + A, this operation would only be 
possible if all the attributes of the Mail Data Table also 
appeared in the Database Data Table. 

The following combinations of tables using the union 
Operator are semantically meaningless or syntactically 
mieorrect. 

h. Text + Spreadsheet 

If A is a text data and B is a Spreadsheet Data 
Table, then A + B and B + A are both syntactically 
incorrect. Neither order meets the requirements that’ the 
attributes of the right operand be a superset of the 
attributes of the left operand. Since this operation is 
Mieerrect, it will be prohibited by setting a "prohibit" 


Seep condition. 
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Peden. + exe 
This is syntactically incorrect in one direction 
only. If A is a Mail Data Table and B is a Text Data Table, 
then A + B is incorrect due to the fact that both tables 
have predefined attributes and the text table can not be a 
Superset of a Mail Table. This operation will be 
prohibited. 
j- Form + Spreadsheet 
This operation is essentially the same as that 
discussed in the Text + Spreadsheet operation. 
weeeeMally + Form 
This operation is syntactically incorrect in one 
m@erection only. Due to their similarity, the discussion 
presented in the Mail + Text operation applies here as well. 
l. Spreadsheet + Mail 
This 1S not meaningful nor syntactically correct 
in either direction due to failure to meet the Superset 
constraints. This operation will be prohibited. 
2. Set Difference 
Given two data tables, A and B, set difference is 
defined as the set of tuples in A that are not in B. This 
operator requires that tables A and B be of the same arity. 
The following parts of this section illustrate how 
performing the set difference operator on data tables of 
different types can be semantically meaningful. The "= is 


used in the discussions to denote set difference. The 


o2 





resultant table of each operation below will be of the same 
type as the table used as the left operand. 
a. Text - Form 
If A is a Text Data Table which contains’ the 
literal text of a corresponding form in a Form Data Table, 
B, then A - Bor B =- A would be very meaningful. For 
instance, if there are many tuples in tables A and 8B and 
table A has been slightly modified, then performing the 
operation, A - B or B = A, provides a quick and easy way of 
determining which tuples of the Text Table were modified 
without the user having to do a line by line comparison of 
the text. 
b. Text - Database 
If A is a Text Data Table and B is a Database 
Data Table, then A =- 8 iS meaningful only if table B 
contains exactly two attributes, the ID and another 
meerioute , “b™, of data type, character. In this case, set 
difference would yield the tuples in A that are not in B. 
Presumably, the TEXT LINE attribute of table A and attribute 
"b" would range over the same domain. 
c. Spreadsheet - Database 
This operation We only meaningful in 
applications of a limited nature. For instance, if Ais a 
Spreadsheet Data Table and B is a Database Data Table, then 
A - 8B 1S meaningful only in the cases where the attributes 


of table B are synonyms or aliases of those in table A. EE 
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table B meets those specifications then it could contain a 
subset of a spreadsheet defined and maintained in table A. 
Using set difference on these tables will yield entry 
positions which are defined in A but not in B, or it will 
reveal discrepancies in entry positions that are defined in 
both tables. For example, assume entry position G-l was 
defined in both tables but the function attribute contained 
a value of " 10 * G:2 " in one table and a value of " 1000 * 
G:2 " in the other. Applying the set difference operator to 
these tables would help to maintain consistency within the 
tables. 
d. Database - Mail 
As with the Spreadsheet - Database operation, 
Database - Mail operation is only meaningful in the special 
cases where the attributes of the Database Table are aliases 
of the predefined attributes of the Mail Data Table. If B 
is a Database Data Table which meets those conditions and A 
is a Mail Data Table, then A =- B would be useful in many 
applications, and would be essentially the same as 
performing the set difference operator on two Mail Data 
Tables. 
The following combinations of tables using the set 
difference operator are either semantically meaningless, or 


Ewmeactically incorrect. 
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e. Text =- Spreadsheet 
The Text and Spreadsheet Data Tables do not have 
the same arity and there is no correspondence, in general, 
among their attributes. Thus, this operation is both 
meaningless and syntactically incorrect and will generate a 
meronibit trap". 
ff. Text - Mail 
The same argument applies for performing set 
difference on the Text and Mail Data Tables as given in the 
Text - Spreadsheet discussion above. 
g- Form - Spreadsheet 
The same argument applies for performing set 
difference on the Form and Spreadsheet Data Tables as given 
in the Text - Spreadsheet discussion above. 
fee corm - Mail 
The same argument applies for performing set 
difference on the Form and Mail Data Tables as given in the 
Text - Spreadsheet discussion above. 
i. Text =- Database 
Applying set difference operator to a Text and 
Database Table is syntactically incorrect only if the arity 
of the Database Table is not the same as that of the Text 
table or the Database table attributes are not synonyms of 
those in the Text table. In these cases, the Text = 


Database operation will set a "prohibit" trap. 
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j- Form - Mail 
The same argument applies for performing set 
difference on the Form and Database Data Tables as given in 
the Text - Database discussion above. 

k. Spreadsheet - Mail 

The same argument applies for performing set 
difference on the Spreadsheet and Mail Data Tables as given 
in the Text - Spreadsheet discussion above. 

_l. Database - Mail 

The Same argument applies for performing set 
difference on the Database and Mail Data Tables as given in 
the Text - Database discussion above. 

3. Intersection 

Given two data tables, A and B, the intersection of 
A and B is the set of all tuples that are in both table A 
and table B. Both tables used aS operands Or the 
intersection operator must be of the same arity. In 
addition, the corresponding columns or attributes of the 
tables must be aliases of each other. 

The following parts of this section illustrates how 
performing the intersection operator on data tables of 
different types can be semantically meaningful. The "&" is 
used in the discussions to denote intersection. The 
resultant table of each operation below will be of the same 


type aS the table used as the left operand. 
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a. Text & Form 
If Ais a Text Data Table and B is a Form Data 
Table, then the operation, A & B, would be useful in 
applications to determine the commonality between the text 
in table A and the literal text in the form in table B. 
b. Text & Database 
If Ais a Text Data Table and B is a Database 
Data Table, then as with the set difference operator, A & B 
is meaningful only if table B contains only two attributes, 
the ID and another attribute , "b", which can be aliased 
mene the TEXT LINE attribute of the Text Data Table. In 
this case, intersection would be meaningful in many 
operations. For instance, assume table A contained a list 
of students taking Trigonometry and table B contained in 
attribute "b" a set of all students taking Chemistry. Then, 
A & B would yield a set of tuples corresponding to the names 
of students enrolled in both courses. 
c. Spreadsheet & Database 
As with the set difference operator, the 
intersection of a Spreadsheet Data Table, A, and a Database 
Data Table , B, is only meaningful in limited applications 
and in the very specialized case where each of the Database 
table attributes is an alias for one and only one 
Spreadsheet table attribute. If table B meets’ these 


constraints then, A & B would be essentially the same as 
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performing the intersection of two Spreadsheet Data Tables 
or two Database Data Tables with identical attributes. 

d. Database & Mail 

As with the Spreadsheet & DatabaSe operation, 
Database & Mail operation is only meaningful in the special 
cases where the attributes of the Database table are aliases 
of the predefined attributes of the Mail Data Table. If 
table A is a Database Data Table which meets those 
conditions and table B is a Mail Data Table, then A & B 
would be useful in many applications, and would be 
essentially the same as performing the intersection of two 
Mail data tables. 

Using the intersection operator is either 
Syntactically meaningless or semantically incorrect on the 
same combinations of tables as using the set difference 
operator. For that reason and to avoid redundancy or 
duplication of the rationale, the reader is referred _ to 
Subsection Gre w2% ( the discussion on the semantic 
meaningless and syntactically incorrect uses of the _ set 
difference operator on certain table combinations). 

4. Join 

A join of two data tables, A and B, is defined only 
if the two tables each have some attribute (column) which is 
defined over some common domain and the join is over those 
two corresponding columns. The result of the join would be 


anew table of the type database, which contains tuples 
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qualified by the attributes of both tables. In other words, 
the structure of the resultant table would contain all the 
attributes of table A, followed by all the attributes of 
table B. Each row in the new table would be formed by 
adjoining tuples of table A with tuples of table B such that 
the new tuples in the resultant table satisfy the predicate 
(condition of the join) in the common columns. 

In the ISS we expect that most joins will be formed 
Meeeean equality predicate, i.e. a condition of "equality" 
between the values in the common attributes of table A_ and 
table B. This type join is called an equijoin. When the 
equijoin operator is used, the duplicate column is not 
eliminated, therefore, this operator can result in 
redundancy in the database unless followed by a projection 
operator to eliminate the common values. The ISS requires 
the common attributes to be identified in the Schema Table 
as aliases or synonyms of each other. 

The semantic meaningfulness of combining tables of 
different types uSing the join operator is discussed below. 
There is a Significant difference between the applications 
discused in Nishimura's analysis, [ref. 1], and _ those 
presented here. In that reference the applications 
described are based on the asSumption that the user will be 
able to use the ID attribute of a given table in performing 


binary operations with the operators. Since that assumption 
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is not made in this thesis, the applications presented by 
Nishimura are not valid herein. 

The symbol "*" will be used in the discussions’ to 
denote the join operator. 

ae Text * Mail 

The join of a Text Table, A, and a Mail Table, 

By, is very practical. If the TEXT LINE ranges over the same 
domain as the FROM, TO, SUBJECT, COPY TO,VIEWED, and DATE 
attributes of the Mail Table, then A * B has many 
applications. As an example of its applicability, Suppose 
table A contained a list of subjects considered high 
merority and assume each TEXT LINE of table A contained only 
one of those subjects. Then, the equijoin of A * B would 
form a new relationship which groups messages in table B 
which are of the same subject and which are also considered 
high priority. Similar applications can be done uSing the 
other type joins and/or Mail table attributes. 

Dee emms:~ Mail 

TALS 1S Similar to the applications of Text * 

Mail. However, in most cases the Form Table will contain 
some FORM LINES which are not actually values in the domain 
of the corresponding Mail Table attribute. For instance, 
uSing the same example as used in the Text * Mail 
application, more than likely some of the Form LINES will 
have data that is not equivalent to a Subject in the Mail 


system. 


100 





c. Spreadsheet * Database 

If table A is a Spreadsheet Data Table and table 
B is a Database Data Table, then the meaningfulness of 
Performing A * B is dependent upon the context of table 6B. 
As an example, assume that table B contains the attributes, 
ID, X COL, Y_ ROW, and DESCRIPTION. Then, the equijoin of B 
* A would provide the user with a table that contains the 
subset of the spreadsheet whose entry positions were 
described in table B. The new relationship would give the 
user a clearer underStanding of the entries in the original 
Spreadsheet Table. 

It should be noted that this application can 
never have an effect on the spreadsheet view since the 
resultant table will always be a Database Data Table, not a 
Spreadsheet Data Table. Performing operations on 
Spreadsheet views are not discussed in this thesis. 

d. Database * Mail 

On page 68 of [Ref. 1], Nishimura presented a 
very detailed analysis of the applicability of joining 
database and Mail Data Tables. These operations will not be 
repeated here, but suffice it to say that with the exception 
of the join created using the ID attributes, the examples 
presented therein are considered feasible. 

It should be pointed out though that the result 
of applying a join to a Database Table and a Mail Table 


would be a Database Data Table which contains all of the 
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attributes of both tables. The resultant table, by no 
means, is ever amail data table as alluded to by the 
examples. Further procesSing would be necessary to convert 
the resultant table into a Mail Data Table which conforms to 
the ISS Mail Data Table format. 

The following subsections briefly discuss tne 
combinations of tables that are viewed as semantically 
meaningless or syntactically incorrect using’ the join 
operator. 

e. Text * Form 

The join of a Text Table, A, and a Form Table 8B 
can be Eoucoumeamoyntactically = On the TEXT LINE and 
Beem CINE attributes of the corresponding tables. However, 
this operation is viewed as having little significance or 
usefulness. The join (equijoin), A * B would result in the 
intersection of the two tables that would just be two copies 
of the same text and would not contribute to any useful new 
relationship among the tuples of the resultant table. 
Furthermore, joins on these two attributes will usually 
result in TEXT_LINES or FORM LINES which are much too long 
to be displayed or printed on the same line. TCE 0° L 
this observation, this operation would tend to be more 
annoying to the user than helpful to him. Any join 
involving this combination of table types would invoke a 


"warning”™ trap. 
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f. Text * Spreadsheet 
The join of a Spreadsheet Table and a Text Table 
is syntactically incorrect. None of the attributes of the 
two tables are aliases of each other and rarely do they 
range over the same domain. This application would invoke a 
Meronibit" trap. 
g. Text * Database 
The join of a Text Table, A, and a Database 
mole, 8, can only be performed on the TEXT LINE attributes 
of table A and some attribute "b" of Table B. Joins of this 
type seem to serve little purpose in as much as no new 
relationship results, nor is there any outstanding inference 
made from the join. Since the join is done on a tuple 
Mees, at best if the Text Table has only one TEXT LINE it 
can be joined through a "not equal join" to produce a new 
relationship, but not a very useful one. Since there are so 
few realizable applications for this join, it too will 
invoke a "warning" trap. 
h. Form * Spreadsheet 
The join of a Form and Spreadsheet Data Table is 
Similar to the Text * Spreadsheet application discussed 
above. 
i. Form * Database 
This combination will invoke a warning trap. 


See the Text * Database discussion for the rationale. 
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j. Spreadsheet * Mail 

Pao OWmeedOlLeSeOkNEnES type is syntactically 
macorrect. The attributes of the Spreadsheet Data Table in 
no way have anything in common with the attributes of the 
Mail Table. The schemas for both tables are predefined and 
there has been no proviSion to alias any of the attributes. 
This, then, supports the argument for the infeasibility of 
combining these type tables through a join. 

Wie add TELOnMMEO | SYMeactic infeasibility, this 
operation does not result in a useful new relationship since 
the semantics of the attributes of the spreadsheet and Mail 
Data Tables have nothing in common. Since this particular 
meme is syntactically incorrect, it will set the “prohibit” 
trap. 

5. Natural Join 

The natural join of two data tables is ae sSpecial 
case of the equijoin which was discussed in section E.4 
above. Sinplistically “stated, the natural join is an 
equijoin followed by a projection. The result of this 
operator contains no duplication Since the common column is 
eliminated by the projection. 

The semantic meaningfulness of using the natural 
join to combine tables has been discussed implicitly in the 
join section with each reference to the equijoin, therefore 


TE will not be elaborated on further. 
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6. CpebaeoNs Summary 


Figure 4.1 Summarizes the information presented in 
this section. The matrix depicts the type trap that will be 
generated by the system when the user tries to use operators 
to combine tables whose combination would be either 
semantically meaningless or syntactically incorrect. I£ the 
table combination is only semantically meaningless then the 
"warning” trap is generated and a message advising the user 
of its potentially meaningless results is displayed. The 
warning trap may also be generated when the operation 
performed on the two tables is deemed to have only a limited 
number of cases where it can be syntactically correct. ce 
using an operator to combine two tables is syntactically 
incorrect in all cases, then this operation generates a 
"prohibit" trap and a message is displayed to the user which 
informs him that the transaction has been prohibited. 

In figure 4.1, the small letters denote the 
Operators, i.e. "u" = UNION, "i" = Intersection, "s" = SET 
PeerPeRENCE, "Jj" = JOIN, and “n" = NATURAL JOIN. If an 
operator does not appear in an entry in the matrix then this 
implies that using that operator to combine the indicated 
table types is both syntactically correct and semantically 
meaningful. The column and row headings indicate the types 


of the data tables used as operands. 
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Figure 4.1 = Operators Usability 
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Vo USER INTERFACE 


This chapter focuses on the conceptual design of the 
user interface to the Integrated Software System. For any 
software system such as the ISS, it is paramount that’ the 
user interface is well designed. Two important design 
issues to be considered are the ease of uSe and Simplicity. 
In other words, the system should be constructed in Such a 
way that the uSer has only a few things to learn in order to 
use the system effectively. These thoughts were the driving 
forces behind the design of Graphics Prototype Interface 
(GPI), which is the ISS Data Manipulation Language (DML). 
GPI and its major features are thoroughly discussed in 
section A of this chapter. 

Since the thrust of this thesis is the design of the 
integration of the selected applications, little mention is 
made of application specific user interfaces. However, some 
consideration was given to the issue of the uSer making 
transitions from ISS integrated modes to application unique 
modes and vice versa. This is discussed briefly in the GPI 
Command Mode commands section. Further application specific 
user interface design details are considered to be outside 
the scope of this thesis. 

One other area that 1S not addressed in the user 


interface is access rights and authorization. These issues 
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add another dimension to the User Interface design structure 
and are highly important. Their absence from this thesis is 
not an omission, nonetheless, since they are viewed as 
matters of system security which are not implemented herein. 
The framework for providing the user interface to effect 
security is, however, embedded within this chapter. 

Section B of this chapter presents a brief overview of 
the implementation of the conceptual design of the user 


interface to the ISS. 


pee GRAPHICS PROTOTYPE INTERFACE 

Graphics Prototype Interface (GPI) is the Data 
Manipulation Language designed for ISS. It adopts some of 
the constructs of Query By Example (QBE), a domain calculus 
DML. Both GPI and QBE are designed for interactive terminal 
use in composing queries and provide graphic table displays 
to formulate user transaction requests and system responses. 
GPI is primarily a tuple based language, however, it also 


incorporates features from both relational algebra _ and 


domain calculus as well. In addition, it contains some 
additional features (such as insertion, deletion, 
modifications, arithmetic Capabilities, printing 


Capabilities, etc.) which are not part of the algebra or 
calculus but serve to increase the power and usefulness of 


the language. 
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The constructs or commands of GPI can be divided 
logically into four parts. First there are the High Level 
System commands which are used to control cursor movement 
and browsing of displays and to change the mode of the 
system. Second, there are Command Mode commands’ which 
perform functions such as "RUN" and can be issued only when 
the system is in the command mode. Third, there are Query 
Mode commands which can be issued only when the system is in 
the query mode, for example the "“UNION" command. And 
fourth, there are Help Mode commands which can be issued 
only when the system 1s in the help mode. The commands 
which are specific to the Help Mode are not addressed in 
this version of GPI. 

These command categories,(with the exception of the Help 
Mode commands), and their uSage are discussed below. Also, 
a discussion on the general layout of the CRT display for 
queries is presented. 

ie General CRT Layout 

The general structure or layout of the CRT is 
depicted in figure 5.1. The figure illustrates the maximum 
amount of information that can be displayed on the screen at 
one time when the system is in the query mode (to be 
discussed later). This layout is based on the assumption 
that the viewing area is a 24 lines, 80 columns CRT screen. 
In order to prevent overcrowding the screen and baffling the 


uSer with too much information displayed at once, some 


aaeke) 





necessary limitations are embedded in the GPI query display 
mechanism. 

As shown, the screen is divided into four distinct 
areas: two data table skeleton areas, a condition box area, 
and a command/response area. For readability, each is 
Separated by two blank lines. The different types of areas 
are discussed below. 

ae Table Skeleton Areas 

The table skeleton areaS are reserved at all 
times during the query mode. Each table skeleton occupies 
exactly seven lines of screen area. The name and type of 
table will appear on the first line of the table area. Line 
two will be blank and on line three will be the data table 
attribute names, preceded by the special column heading, 
"COMMAND". The COMMAND column is used by the user in order 
to specify what command is to be performed on the selected 
tuples of the table. To distinguish this column heading 
from one of the attributes, it will be highlighted. 

Lines five through seven of each table skeleton 
area are referred to as the query lines and are reserved for 
user query entries. The use of the query lines to formulate 
queries will become clear in later discussions. 

GPI provides a mechanism for the user to change 
table skeletons. This display facility is described in 


detail in section 3. 
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Daeconadition Box Area 

The concept of a condition box was adopted from 
the QBE language. The condition box area is used to define 
conditions placed on attribute variables which are used in 
the query. It follows the second table skeleton area and 
occupies three lines on the CRT. The heading, “CONDITION 
BOX ", is displayed on the left side of the condition box 
area. The box area to the right side of the heading is 
reserved for the user to specify conditions placed on 
variables which represent attributes from the data tables. 
For example, if "X" is a variable name used in the query 
line of a table, a further condition can be on the value of 
the attribute which "X" represents by placing a statement 
meeeas " °-X > 10 " in the condition box (note that the 
question mark preceding the variable "X" in this example is 
part of the syntax and iS not a part of the actual variable 
name). This statement would inform the system that the user 
wants to include in the query only those tuples which have a 
value greater than 10 in the attribute corresponding to 
variable “X". 

Several unrelated conditions may be placed on 
the same line of the condition box, however they must be 
Separated by a ";" character. The conventional arithmetic 
symbols are valid in writing GPI arithmetic statements, 
meee. +, -, *, /, C3 ye In addition, the comparison 


Operators, (Such as >, <, >=, <= ) can be used. The logical 
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operator symbols used are: "&&" (and), "I|" (or), and "7" 
(not). 
c. Command/Response Area 
The Command/Response area occupies only two 
lines. It is used by the system and by the user. When the 
user iS operating in the command mode, all Command Mode 
commands ,i.e, RUN, CREATE, etc, must be entered in the 
Command/ResponsSe area. In addition, once a user runs a 
transaction the system issueS a reSponSe in this area to 
inform the user immediately as to where the results of the 
transaction can be found or if further actions are necessary 
to complete the transaction. If a trap has occurred in an 
attempt to execute a transaction, that information is 
displayed in this area as well. 
2. High Level System Commands 
The High Level System commands are of two types; one 
type 1s for cursor control and the other type is for 
changing the mode of the system. Thies Cuesor “control 
commands are effected by Special keyboard keys and their 
function is to allow the uSer to easily browse or scan _ the 
display. These commands also enable the uSer to edit data 
entered on the table skeletons or condition boxes with ease 
and they can be used regardless to which mode the system is 
in (GPI operates in three distinct system modes: Command, 
Query, and Help. Fach of the modes allow mode specific 


operations to be performed). 


Tigz 





Each High Level System command is defined below. 
The first six are cursor control commands and the next three 
are mode change commands. Each of the Mode change commands 
are preceded by the symbol "“" which represents the control 
keyboard key. This is to indicate that the control key must 
be pressed simultaneous to the corresponding mode character 
in order to activate all mode change commands. 

a. ~~ = move the cursor up one line from current 
location 

b. v= move the cursor down one line from current 
location 

Cc. > = move the cursor one character position to 
the right 

d. < = move the cursor one character poSition to 
the left 

e. tab key - move the cursor to the beginning of 
the next field 

IEF carrlage return =- move the cursor to. the 
beginning of the first field in the next line 

g. °C = change the system to the command mode 

h. “Q = change the system to the query mode 

i. “H = change the system to the help mode 

In the next two sections the specific commands which 
can be run only in the command mode and the query mode are 
discussed. The commands which are unique to the Help mode 


are not presented since the Help Facility is not implemented 
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in this version of GPI. The Help mode was mentioned here 
nonetheless, Since its usefulness and importance 1S 
recognized. 
3. Command Mode Commands 

GPI provides several commands which are unique to 
the command mode. These commands are entered by the user on 
the Command/Result area of the CRT display and are initiated 
following a carriage return. In order to lessen the 
probability of the user inadvertently executing a command, a 
safety check is embedded in the commands which affect the 
current data table. 

Each of the commands are described in detail below. 
In order to lift the burden of typing in long command words, 
the uSer may use the default and type only the first two 
letters of a given command. Also, in cases where the 
command requireS one or more parameters, if the user types 
in only the command the system will prompt him for the 
parameter values. 

a. Table 

The TABLE command is used to draw a new table 

skeleton. The format for this command 1S : TABLE <table 
poSition> <table name> <table type> . Valid entries for the 
parameter, <table position>, are either "1" or "2". These 
numbers correspond to table skeleton area one or two as 
displayed on the CRT. The parameter, <table name> and 


<table type> refer to the name and type of the desired data 
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Eable skeleton that the user wants displayed in the 
specified table area. Valid table types are -TEXT, FORM, 
SPREAD (spreadsheet), MAIL, and DB (database). 

If a table position of 1 is issued then the 
system will automatically erase the table skeleton already 
displayed in table area one and replace it with the table 
skeleton whose attributes correspond to the table name and 
table type specified. For example, if area one contained a 
text data table skeleton and the user issued the following 
command : TABLE 1 message mail , then the text data table 
skeleton would be erased and replaced with a mail data table 
skeleton. The mail data table named "“message“ would then 
become the current data table in table area one. 

b. Erase 

The ERASE command is used to erase user supplied 
information from the table skeleton areas or the condition 
box. The format for this command is : ERASE <position(s)>. 
Meera entries for <position(s)> are "1", "2", "3", or "*". 
Position number 1 or 2 correspond to the respective CRT 
table skeleton area. Position number 3 corresponds to the 
condition box area and a position entry of "*" refers to all 
three of these areas. For instance, if the command, "ERASE 
*", was entered, then all query information entered by the 
user in the data table areas and the condition box area 
would be erased. The current skeletons for the areas would 


be redrawn. 
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Cre cla 

The RUN command is issued in order to execute a 
transaction (query). The format for this command is simply 
: RUN. In cases where the execution of a transaction will 
change the contents of either of the current data tables 
(transactions involving INSERT, DELETE or MODIFY Query mode 
commands) the user will receive a "COMMIT? " response from 
the system. The user must then respond with a "yes" or "no" 
answer. If the answer is yes then the current data table 
will be updated after completion of the execution of the 
transaction. This extra step provides a safety feature 
which prevents the uSer from making irreversible mistakes. 

d. Saveto 

The SAVETO command 1S used to Save the contents 
of a system defined table, ISSRESULT, into a table of the 
user's choice. The ISSRESULT table is a temporary system 
table which contains the results of the most’ recent 
successfully completed transaction. The table that is named 
as the recipient of the save must be the same type and 
Structure and type as the ISSRESULT table. If the recipient 
table does not exist already, then the system will 
automatically create a table whose structure is the same as 
the ISSRESULT table and assign it the user specified name. 
The format for the command is : SAVETO <tablename> <table 


type>. 
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e. Undo 

The UNDO command allows the user to reverse the 
effects of the most recently committed transaction. It 
takes no parameters and thus its format is just : UNDO . 

iw DLrectoLry 

This command will display the tuples of the 
specified Application Directory table. Zecwetormat. 1S 
DIRECTORY <application type> . Valid entries for the 
<application type> parameter are TEXT, DB,SPREAD, MAIL, and 
FORM. 

Once the Directory command is issued, the system 
will automatically clear the screen and will display as many 
tuples of the specified directory as will fit on one Screen. 
The user will be prompted by the system response "MORE?". A 
user response of "yes" to the queStion will initiate 
pemotling of the tuples in the Application Directory table. 
Once the uSer responds with "no" , the scrolling will cease 
and whatever contentS were present on the CRT before the 
DIRECTORY command was issued will be redrawn on the 
terminal. 

Ge. Application 

This command enables the user to suspend any 
actions being performed and to enter one of the five ISS 
Application specific modes. The format used is : 


APPLICATION <application name> . 
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AS an example, if a user desires to use _ the 
capabilities unique to the Spreadsheet Application Software, 
then she would enter the command “ APPLICATION SPREAD ". 
The current contents of the screen would be erased and the 
user would be placed in the SPREADSHEET application. The 
user would then perform operations unique to the spreadsheet 
application such as displaying a spreadsheet view. Once the 
user terminates the application session, she may continue 
from where She left off prior to executing the APPLICATION 
command. 

h. Create 

The CREATE command iS used to create new data 
tables. It has several embedded prompts or subcommands 
which make the task of creating new tables very elementary. 
Its format iS simply : CREATE . 

Once the command is issued the user will 
Progress through a series of system prompts which will 
ultimately build the data table. The first system prompt is 
for the table name followed by a request for the table type. 
If the type is one of the predefined Structures, ie TEXT, DB 
(Database), FORM, SPREAD,or MAIL, then the prompting session 
is over at this point and the system will add the new data 
table to the appropriate Application Directory table. 

If the new table is a database data table then 
the prompting SeSSion continues’ and the user is asked to 


provide the attribute name, data type, character width. 
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Also the user is asked if the attribute is a key field. 
After the user has completely defined the structure of the 
new database data table and terminates the prompting 
session, the new table will be added to the Database 
Meplication Directory. Also, the system will then 
automatically update the Schema Table as neccessary. 

Following the CREATE command, the system will 
issue a response on the Command/Response area to assure the 
user that the new data table has been created and added _ to 
the appropriate Application Directory table. 

i. Delete 

In the command mode, the DELETE command refers 
to the deletion of a table from the ISS database. The 
format is DELETE <table name> <table type>. Before 
executing this command the system will first echo the user's 
request and will ask the user if he is sure that he wants to 
execute the command. If the user response is no then the 
command is aborted. Otherwise, the specified data table is 
deleted from the system by erasing all references to the 
table in the application Directory table and other system 
tables. 

mee )6=C CUTE 
The command, QUIT, enables the user to exit the 


ISS system. The format for this command is simply : QUIT. 
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4. Query Mode Commands 


The Query mode commands of GPI provide the user with 
a very simple technique for writing both elementary and 
complex queries. A subset of the commands is based on the 
Relational Database Model and corresponds to the ISS 
primitives and combining operators which were discussed _ in 
chapter 4. 

As mentioned earlier, each table skeleton display 
has a command column in addition to the data table attribute 
columns. It is in this command column that the user 
specifies query commands to be performed on selected tuples 
of the underlying data table. 

Query mode commands can be used in conjunction with 
user specified variable names, literals, or order indicators 
(described below). The variables are preceded by a question 
mark. They are used as domain variables in cases where the 
user desires that the operation indicated by the command in 
the command column be performed only on tuples with certain 
conditions placed on attributes represented by a variable. 
Variables may also appear in the command column provided 
that they are preceded by a query command. Variables of 
this type are referred to as tuple variables since they 
represent the entire tuple as opposed to only one attribute. 
Variables which are used to express conditions other than 


equality are further defined in the condition box. 
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Any eschdtactwer string which 1S not preceded by-an "@" 
or "2?" and appears in an attribute column is considered to 
be a literal. If a literal appears under an attribute name 
then this implies that the query command only applies to 
tuples in the data table which have the same value as_ the 
literal in the specified attribute. 

Order indicators can be used to specify attribute 
precedence or reordering of attributes. As an example, if a 
data table is to be sorted on two or more attributes, then 
the order indicator can be used to specify which field 
(attribute) is the primary sort field, secondary sort field, 
ere. The syntax for the order indicators is any integer 
preceded by an "@" Sign, i.e. "@1", and if used they must be 
placed in the column of the attribute to which they apply. 
The lower number indicates’ higher precedence. Order 
indicators can also be used with the project, print or 
display command if the user desires to see the table 
attributes Stored, printed or displayed in an order 
different from the way they appear in the data table. 

The definition and usage of the individual Query 
mode commands are given below. The semantics of many of the 
commands have been defined in Chapter 4 and will not be 
repeated. However, many examples will be presented in order 
to show how the user can use the GPI language to effect the 


Gueries. In addition, the use of the variables, literals, 
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and order indicators will be more clearly illustrated in the 
examples. 

Each query command must be preceded by the symbol 
"1" and at least the first two letters of the command must 
be typed. The user may, however, choose to type the entire 
command. 

a. Delete 

The deletion operator is effected by the GPI 
Bommand "DELETE". The tuples deleted by this command are 
Stored in a SyStem database data table named ISSDELETE. 
ISSDELETE automatically assumes the Same Structure of the 
table from which tuples were deleted. Plgiine 5.2) 1S an 
example of a GPI query to delete all the tuples whose ID is 
greater than 3 and less than 10. The command !DE is entered 
on the first row underneath the command column. The "?X" is 
a variable that represents the ID attribute and it 1S _ used 
to qualify which records of TableA are to be deleted. The 
qualification of the tuples is made complete with the 
condition statement in the condition box. 

After executing this query tableA would no 
longer have the TEXT LINES which corresponded to ID numbers 
4 through 9 of TableA prior to running the query. However, 
BHemlooDELETE table would contain the six deleted TEXT _LINEs 


and its structure would be the Same as a text data table. 
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b. Insert 

The Insert primitive operator corresponds to the 
command INSERT. It is used to insert a tuple or table of 
tuples into a target data table at a specified location (ID 
value). The command may be unary or binary, 1.e. operate on 
one or two data tables. If it is used as a binary operator 
then the results of the command are stored in the table 
whose structure appears in the first table skeleton area. 
The contents of the second table will be unaltered by the 
execution of the INSERT command. 

If the user desires that the tuple or table be 
inserted into the target table at a given location or ID 
value, he may indicate this by appending the value of the ID 
to the end of the INSERT command in the first table. For 
instance, figure 5.3, part A, gives an example of a query to 
insert TableB, a text data table into TableA, a form data 
Bele, at location 3. 

Part B of the figure ShowS an example of the 
INSERT command used as a unary operation to insert a Single 
Maple at location four of the target data table. The 
literals ZiOuUGS ae Owen and "Brac™ constitute the 
attribute values of the inserted tuple. 

Part C illustrates a query to insert TableB, a 
Smaller data table (one with fewer attributes) into TableA, 
a larger data table. Since the attributes of TableB are a 


Subset of those in TableA, this is a valid insertion and is 
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consistent with the discussion of the semantics of the 
Insert primitive operator given in chapter 4. Before making 
the insertion the system will add the REVIEWER attribute to 
TableB, but the Reviewer value for each tuple in the table 
will be blank. Since there is no ID value specified in the 
INSERT command, the tuples of TableB will be appended to 
those in TableA. 

Figure 5.4 shows the contents of TableA after 
the execution of the query in part C of figure 5.3, assuming 
that each table had only one tuple. Note that the REVIEWER 
attribute of the "ET" tuple from TableB is empty. 

ee Clodl fy 

The Modify primitive operator corresponds to the 
GPI command, MODIFY. The execution of the command will 
directly alter the contents of the underlying data table. 
It can be used in several ways. Pawn Of Elgure. 5.5 
represents GPI syntax to perform a query to increase the 
Salary of all employees listed in TABLEB by ten percent. In 
this example a variable and a condition is used to effect 
this modification and all tuples are affected. 

Part B represents the query " change Ge 
Thompson's salary to $100,000 and JOBTITLE to manager ". 
Both literals and attribute variables are used to perform 
the query. The variable "Y" listed under the NAME attribute 
and qualified in the condition box by an equality statement 


lets the system know to only modify the tuple in tableA that 
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fas a value of “G.- Thompson™ in the name attribute. The 
literals "manager" and "100000" are the values that will 
replace whatever was in JOBTITLE and SALARY attributes of G. 
Thompson’ tuple entry. 

Gee Project 

The Projection primitive operator is effected by 
the command PROJECT. This command does not change the 
operand data table and thus is committed as soon as the RUN 
command is issued. PROJECT will create a new data table of 
the same number of tuples as the table in which the 
projection was performed. The attributes of the new table 
will consist of the system generated ID and the indicated 
subset of the attributes of the original data table. The 
new table will be stored in the ISSRESULT temporary data 
table. (Note that the ISSRESULT table can be copied into a 
permanent user data table by uSing the SAVETO command = as 
discussed previously). 

Figure 5.6 illustrates a query to project out 
the NAME and SALARY attributes of TABLEB. It also shows the 
Structure of the ISSRESULT table after the projection query 
terminates. 

e. Select 

The SELECT command corresponds to the Select 
primitive operator. It 1S used to extract tuples of the 
Current data tables which satisfy some specified condition 


and store them temporarily in the ISSRESULT table of the 
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Same structure. The SELECT command is quite versatile and 
is often used in the same query as other commands. 

Figure 5.7, part A Shows an example of a query 
to select all tuples whose NAME value is "P. Harrison" and 
MOBTITLE is “laborer". The literal "P. Harrison" in the 
name attribute and "laborer" in the JOBTITLE attribute 
determine which tuples are selected. In part b, the syntax 
expresses a query to find the tuples of all laborers in the 
data table whose salary is $9,000. Les cemecarns two 
literals, "laborer" and "9000" on the same query line and 
this implies an " AND" relationship. In other words, if a 
literal or variable appears in the columns of more than one 
attribute and on the same query line, then a “logical and" 
on the values of the attributes is done to determine which 
tuples are selected. Part c of the figure shows the same 
query as part b, but uses variables and the condition box in 
the syntax. Using the condition box adds greater query 
flexibility since it allows one to place any number of 
conditions on the variables. For instance, if there was a 
desire that only the laborers whose salary is greater than 
$10,000 be selected, then the entry in the condition box 
mould be "= ?Y > 10000. 

Ee | Union 

This command is used to perform the union 

Primitive operator whose semantics has been discussed in 


chapter IV. Since union is a binary operator, the UNION 
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command always operates on two different data tables 
simultaneously (this is the case with all commands” which 
represent binary operators). The data table which appears 
in table area one is referred to as the dominant table (or 
left operand) and it determines the structure of the results 
from the union. The table that appears in table area two is 
called the subordinate table (or right operand) of the 
operation. 

Consistent with the Union operator discussion in 
chapter IV, the UNION command may be performed on tables of 
different arity as long as the attributes of the dominant 
table are a subset of the subordinate data table. The 
syntax required to form a union query 1S very simple and is 
indicated in figure 5.8. All that is required is that the 
command !UN (or !UNION) be placed in the first row of the 
command column of each data table. Part a of figure 5.8 
represents a query to form the union of a text data table 
and a form data table. TableA, the text data table Is the 
dominant table in this query so the attributes of the 
ISSRESULT table will have the attribute names ID _ and 
TEXT LINE after execution of the query. The union of these 
two tables is feasible since TEXT_LINE and FORM LINE are 
aliases of each other. 

Part B of the figure illustrates a query to 
perform the union of a text data table and a mail data 


table. The results of this union would also have the 
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attributes only of the text data table. The contents of the 
ISSRESULT table would be all the TEXT _LINE's of MTABLEA 
Mettowed by the BODY of all tuples in TABLEC, the mail 
table, which were distinct from the TEXT LINE'’s of MTABLEA. 
The BODY attribute is used in the union Since it is an alias 
femeene TEXT LINE attribute. The rest of the attributes of 
the mail data table would be ignored by the system. 
g. Intersect 

The INTERSECT command performs the intersection 
binary operator. It operates on two data tables of the same 
arity. In addition, each attribute in the dominant table 
must have one and only one attribute in the subordinate data 
table of which it is an aliaS or Synonym and vice versa. 
The results of the INTERSECT is a set of tuples that are in 
both the dominant and Subordinate data tables. Figure 5.9 
Shows a query that will take the intersection of a text data 
table and a form data table. Since IN is the short form of 
the INSERT command, IT 1s used as the short’ form of 
INTERSECT in order to distinguish the two commands. 

meer ~Sdift 

The SDIFF command corresponds to the set 
difference operator. Is a binary command that requires both 
the dominant and subordinate data tables to be of the same 
aelty . ts poses the same alias constraints on _ the 
attributes as the INTERSECT command. SDIFF will compare the 


two data tables and store in the ISSRESULT table all tuples 
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that are in the dominant table that are not in_ the 
subordinate table. Figure 5.10 illustrates a query to take 
the set difference of two mail data tables. 
ee, Gerdes aia 

JOIN is a binary command that operateS on _ two 
data tables which may be of the same or different arity. 
The semantics of the command is discussed in chapter IV. It 
is only meaningful if the attributes on which the join is 
performed range over the same domain. Figure en 
illustrates the syntax to perform the query " join tuples of 
tableB to tuples of tableA if the subjects are the same ". 
This essentially is an equijoin on the SUBJ attributes 

In the figure the variable "Q" is used to denote 
which of the attributes of the tables are used to form the 
join. The fact that the same variable appears in both 
tables under the appropriate attribute column implies an 
Semryoln. (If the condition of the join is other’ than 
equality then two different variables must be used and the 
relation between them must be stated in the condition box). 
The ISSRESULT table will contain the results of the query 
and the structure will be all the attributes of MTABLEA 
followed by all the attributes of TABLEB. Note that this is 
redundant since both of the SUBJECT attributes will appear 


in the results and will contain duplicate data. 
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ise NOUN 

The NJOIN command is used to perform the natural 
join binary operation. By "derimrelon, Matural join 1s an 
equijoin with the duplicate attributes removed or projected 
out of the resultant data table. Forming a query to perform 
a natural join of two data tables follows the same structure 
aS a query to effect an equijoin (except of course the 
command entered would be !NJ or !NJOIN as opposed to !JO or 
!JOIN). 

Wem cone a t 

The CONCAT command 1S used to perform the ISS 
specialized operator, concatenate. This command is unary, 
i.e. it operates on only one data table at a time. Le 2s 
used to concatenate the values of the attributes of a data 
mMaore Of arbitrary arity into a text data table structure. 
The values of the attributes are separated by one space. 
The results of the operation is stored in the ISSRESULT 
table and the operand data table is not altered. 

Figure 5.12 showS an example of a query to 
convert a database data table into a text data table. Part 


b shows the results as they would appear in the ISSRESULT 


table. 
im OOT t 
The SORT command is not related to any ISS 
primitive or combining operator. fees included in: the 


command set because of the recognition of its great utility 
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and its enhancement to the ISS. It provides the capability 
for the user to sort a data table on any number of 
attributes in ascending or descending order. The default 
Sort order is ascending. 

The user designates the precedence of the sort 
field or attribute by placing an order indicator (an integer 
preceded by the "@" sign) in the column under the attribute 
name. The integers must be unique and the lower one takes 
the higher precedence. If descending sort order is desired 
on any field then the order indicator should be followed by 
the letter "D". Figure 5.13 shows an example of a query to 
sort a mail data table in descending order based on the date 
and within the each date sort on the TO attribute in 
ascending order. No variables or condition statements are 
required to effect a sort. 

m. Display/Print 

A Data Manipulation Language is not very useful 
from the user's viewpoint if it does not provide a facility 
for users to form queries that involve output. The commands 
DISPLAY and PRINT are included among the Query Mode commands 
for that reason. DISPLAY is used to direct output to the 
Bees and PRINT directs output to the line printer. Both 
commands operate on an entire data table which could either 
be the original current data table or the ISSRESULT table. 
In other words, if the PRINT or DISPLAY command is the only 


command in a query then all tuples of the original table 
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will be printed (displayed), however if PRINT or DISPLAY is 
preceded by other commands in the same query, then all 
tuples of the ISSRESULT table will be printed (displayed). 
(Note that the use of attribute variables with a PRINT or 
DISPLAY command is simply ignored by the system since’ these 
commands operate on tuples only.) 

The PRINT and DISPLAY commands can operate in 
two output formats. The default format is to output the 
attribute in table form with each row representing a _ tuple 
and each column representing an attribute name (the names of 
each attribute would appear at the top of each column). 2 a 
the default is desired the user may simply type the command 
(PRINT or DISPLAY) or he may append a "1" to the end of the 
command. Figure 5.14 shows an example which uses format 
one, the default. As illustrated in the figure, if the user 
requires that the attributes be printed in different order, 
then the order indicators could be used to determine the 
printing sequence for the attributes. The results of the 
Query are also shown. It should be noted that the generates 
temporary output which is not stored in the ISSRESULT table. 

The other output method will display or print 
each of the attribute names on a separate line followed by 
the attribute value and it is invoked by appending a "2". to 
the end of the DISPLAY or PRINT command. This method limits 


the amount of information that can be displayed on the CRT 
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as compared to the default format. Figure 5.15 illustrates 
a display query which invokes format method two. 
eee =Combinin uery Mode Commands 

Section 4 explained the query mode commands, defined 
the command syntax required for each of them, and 
illustrated their use by presenting various examples. iin 
each example, only a single command was used in each query. 
This section addresses the issues of compound queries, i.e. 
those involving more than one query command in the same 
transaction. 

Certain precedence rules apply when the user 
combines query mode commands. First of all, the order in 
which the commands appear in the table skeletons is 
important. Straying from conventional programming languages 
somewhat, if more than one command is listed in a table 
skeleton then ISS will execute the commands in a "bottom up" 
order and each subsequent command will operate on the data 
table results of the previous command in the query. Also, 
if two data tables are used in the query, then commands that 
apply only to table number two will be executed before any 
commands in table number one will be performed. Commands 
which apply to both data tables will be executed last and 
following the convention, they should be the appear on _ the 
first query line of both data table skeletons. Daas 
condition also imposes another important rule that must be 


followed when using multiple commands. The rule is that 
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only one binary command is allowed in one transaction. This 
convention is necessary in order to keep the syntax simple 
and limit the amount of meaningless queries that can be 
formed by the user. 

Another important issue to address is the use of 
domain variables in multiple command queries. In the 
queries where only one command was used this issue was not 
pressing. But, when there are several commands used either 
in the same data table or within both data tables a few 
extra rules are necessary. For example, if the same domain 
variable is used in an attribute of table one and also in an 
attribute of table two, the convention is that only the 
tuples of both tables that have the same value in those 
attributes will be operated on. 

Tuple variables were mentioned briefly in the GPI 
mieroduction. These tuple variables, if used , must be 
preceded by some query mode command in the COMMAND column of 
the data table skeleton. They indicate that the command 
applies to the entire tuple and they are optional in cases 
where their absence presents no ambiguity. If a variable is 
used as a tuple variable in a transaction, it can not be 
used aS a domain or attribute variable in that same 
transaction or query. 

Tuple variables differ from attribute (domain) 
variables in yet another way. If the same tuple variable 


appears in several different commands of the same data table 
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it implies that the subsequent command only applies to 
tuples that were qualified by the prior command. FOG 
Example, in figure 5.16 the tuple variable "Z" appears both 
with a SELECT command and a MODIFY command which operate on 
the given data table. In this case, the SELECT command will 
be performed first and the tuple that has a value of "WEEZY" 
in the name attribute will be selected and assigned to the 
variable "Z". The MODIFY command will be performed next but 
will only apply to tuples which are assigned to variable 
"Z". If no tuple variable was associated with the MODIFY 
command in this example, then execution of the query would 
update all addresses in the data table to "1 Popeye st" 
instead of just modifying Weezy's address. 

An example which illustrates most of the issues 
discussed in this section is shown in figure 5.17. The 
query represented operates on two data tables, Salary and 
Personnel. The query corresponds to “update all employees' 
Salaries by twenty percent and join the Salary and Personnel 
tuples of all single male employees who work in either 
department A or department B and whose salary is $45,000 or 
more*. 

The SELECT command in table 2, Personnel, selects 
tuples from the data table whose SEX attribute equals "male" 
and whose marital status equals "Single" and assigns them to 
tuple variable "Z". These tuples are the only ones from the 


Personnel table that will be utilized in the JOIN. This is 
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indicated by the presence of the same tuple variable, "Z", 
following the JOIN command in the Personnel table skeleton. 

Before the join is done, commands which pertain only 
to the Salary data table are executed. In this table, the 
MODIFY command will be performed first ame will increase the 
SALARY attribute of all tuples in the Salary data table by 
twenty percent ( all will be updated since no tuple variable 
mee indicated). The SELECT command will be performed next 
and the two attribute variables, "S" and "D" will be used to 
qualify the desired tuples. The condition box is checked to 
determine the conditions placed on the attribute variables. 
Since the Select command is followed by the tuple variable, 
"X", all tuples which meet the criteria of the SELECT 
command will be assigned to this variable. These tuples 
will be the only ones from the Salary data table that will 
be used by the JOIN command. 

Now that all the table specific commands have been 
performed, the system then executes the binary command, 
JOIN. AS indicated by the "E" attribute variable, the join 
Bemeeperformed on the EMP NO and EMP ID attributes of the 
corresponding tables, and only the previously qualified 
tuples of each table will be joined together. MThe join 
performed iS an equijoin. 

Although this example is not very complicated, it 
Bueticiently demonstrates how one can utilize the 


capabilities to combine Query Mode commands in one 
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transaction to compose queries of varying degrees of 
complexity. It also illustrates’ the flexibility and 
simplicity the user is afforded with the Graphics Prototype 


Interface Data Manipulation Language. 


B. IMPLEMENTATION OVERVIEW 

In the conceptual design of the user interface, several 
intricate design implementation issueS and requirements 
arose. Due to the complexity of some of these issues and 
time consStraints, realization of the GPI as described has 
been precluded. However, the intent of the deSign was to 
present a model system on which to base implementation. In 
order to test the feasibility of the design, the GPI 
commands described in section A of this chapter were 
implemented (prototyped) by mapping them into Shell programs 
written in the C and SHELL programming languages in the UNIX 
Operating System environment. This prototyped 
implementation, although not as elegant as the proposed 
system Suggests, also serves to demonstrate the usefulness 
eeor! and ISS. 

In the following chapter the implementation Strategy 
used to effect the Prototype ISS is discussed, as well as 
the limitations of the strategy. Any limitation which gives 
rise to follow-on reSearch is pointed out in chapter VII, 


the conclusion. 
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VI. IMPLEMENTATION PROPOSALS 


This chapter proposes a high level design for an ISS 
implementation and a strategy for a very Simple prototype 
implementation using C Shell in the Unix environment. The 
prototype design is included to assist continuing research 
efforts to quickly implement a prototype system. When the 
basic system is working it may be evaluated, improved and 
expanded in future refinements of the overall design. This 
chapter is by no means a full design specification, as that 
task itself would cover much of the material for a separate 
masters thesis. The deSign proposal covers the ISS kernel 
main modules only, describing the overall hierarchy of the 
system. Subsequent refinements should include detailed 
module interfaces, file usage and data descriptions. 

Although not stated explicitly, the ISS itself is not a 
database management system (DBMS) and so its design does not 
include those issues handled by a DBMS. These include 
rollback and recovery, concurrency control and retrieval 
Strategies. It should be recognized that the lowest level 
modules in ISS, the primitive operators, will be highly 


dependent on the underlying DBMS or operating system. 
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Pee) 6S YSTEM DESIGN 

The basic Structure of ISS operates upon an underlying 
operating system, DBMS or file system for utilities, file 
manipulation and management services. Figure 6.1 shows this 


basic relationship. 


| OS/DBMS | 
| | 


| Database | 


Figure 6.1 =- Basic ISS Relationships 


Within the ISS there are five major levels which are 
shown in figure 6.2 and correspond to levels of operation o€ 
the system: system entry; command mode interpretation, 
command, other mode and applications execution; graphics and 
run time sequences; and at the lowest level the ISS 
primitive operators. The basic operators then translate 
their calls into the appropriate calls on the operating 
eyocem or DBMS. 

The system database in use will include the Schema 


Table, the Application Directories, the ISS Users Table, the 
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ISSRESULT and ISSDELETE Tables as described in chapter three 
of this thesis. In addition, a number of tables will be 
required to hold error and trap conditions, warnings and 
error messages, graphics data and probably other run time 
requirements. A number of System variables will need to be 
maintained to hold directory entries of current tables, 
their types, system status, user and security information 
and, of course, run time temporary variables. 

The following subsections briefly describe the major 
design modules and groups of modules in the system as 
depicted in figure 6.2. Groups of modules are shown in open 
boxes and represent Separate equal level modules but are 
compressed into one group for in order to prevent 
overcrowding the diagram. 

ime, antcy Module 

The Entry Module is responsible for invoking the ISS 
password procedure and identifying the prospective users. 
After a user is identified the entry module must perform the 
necessary initializations and then call the command module. 
If the password procedure fails after a predetermined number 
of times the system should take whatever action is 


determined appropriate. 
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2. Command mize cpreter 


The Command Interpreter is a Simple interpreter 
accepting commands or mode changes and calling the correct 
command execution module. After each command terminates the 
interpreter should reinstate the previous display. This may 
require direct calls to the GPI 

3. Help 

Help is a simple help facility which may be 
implemented in many different ways. Help is entered from 
any interactive environment by the entry of control H (“H) 
and returns control to the calling module. 

4. Query Mode 

The Query Mode is the module which allows entry of 
user queries into the skeletons described in chapter V. 
This modules draws heavily upon the services of the GPI 
package to allow on-screen editing of queries. The editing 
of the gueries has no execution significance within the 
query mode, it is simply an entry function. The query mode 
memernvoked by the entry of a control Q (“Q). 

Sa ol lerackage 

The Graphics Prototype Interface manages the 
graphics interface and storage of current queries. It is 
used by the Query Mode Module to change and enter queries 
and by the Query Parser to get the stored data. This module 
is highly dependent on the physical characteristics of the 


target system. 
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The Query Parser translates the queries in the 
skeleton form into a sequence of calls on the ISS operators 
and is invoked by the RUN command. It must retrieve the 
skeletons from the GPI Package, parse them, perform type 
checking for the issuance of warning of warnings and traps, 
check authorizations and privileges of the user, ask for 
verification on transactions which change the database, 
translate virtual tables to queries and, finally, execute 
the final transaction uSing a sequence of calls to the basic 
operators or return an appropriate error message. 

7. Other Commands 

Each of the other commands UNDO, SAVETO, DIRECTORY, 
TABLE, ERASE CREATE, DELETE and QUIT invokes a module to 
effect the command. The TABLE and ERASE commands must call 
on the GPI Package since they affect the current skeleton 
display of queries. The other commands may produce output 
and temporarily displace the current query image but are 
themselves overwritten by a redisplay of the current 
skeletons after execution. Some commands such as SAVETO 
will need to make direct‘calls to the underlying operating 
system or DBMS to perform functions. 

8. Application Programs 

Each of the logical applications is invoked from the 

command interpreter module and is enclosed in its own 


module. The applications may make calls to the operators 
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and the operating system to achieve their objectives, but 
are in reality very much like a view on the database. It is 
the single underlying database with its set of operations 


which is at the heart of the integration of the system. 


B. PROTOTYPE IMPLEMENTATION STRATEGY 

In any system design a prototype implementation is often 
useful to determine unforeseen shortfalls and difficulties 
of a proposed system. This section is intended as a brief 
description of a strategy for implementing a simple ISS 
prototype kernel using C Shell running under Unix on a Vax 
11/780 system currently operating at the Naval Postgraduate 
School, Monterey. AS with many prototypes, efficiency, 
completeness and absolute adherence to the defined system is 
not intended, instead it represents the first step toward 
the Unix environment philosophy to "get something small 
working aS soon as possible". (Ref. 14] 

The target display is a simple 24 by 80 column display 
with Unix as the underlying operating system with its 
standard tools and file services. The following subsections 
describe the overall strategy, the database and a simple 
"graphics" implementation. The descriptions assume a 
knowledge Oe the C Shell interpreter and the Unix 


programming environment. 
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Each module described in section A may be 
implemented as a C Shell script, calling on lower level 
scripts and Unix services when required. Tasks which are 
difficult to achieve using the C Shell may be written as C 
routines and then added to the standard set of Unix tools. 
One such example is the requirement for a utility to perform 
set difference on unsorted files. A simple C program may be 
written to perform this function, albeit inefficiently, and 
then called in the same way as normal Unix commands. 

Many of the Unix commands act as filters, acting on 
Standard input or specified files and send results to the 
Standard output. Unix's ability to redirect the standard 
input and output provides powerful file manipulation 
facilities, and since the majority of tasks in the ISS 
kernel are file manipulations, this environment is ideal. 
Figure 6.3 shows a very simple minded example of a= shell 
script for the DELETE operator using the Unix Awk pattern 
program and redirection of the results into the desired 
tables. Awk operates as a filter reading the whole file and 
performing the actions on each line in this case printing to 
the desired file depending on the value of the condition for 
the line. ( The $1 and $2 in the Awk command lines are 
expanded to the parameters of the call to the delete 


Berirpt.) 
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Operations on files may be implemented by using 
filters to process input files into temporary files and 
result files. Processing can continue through a series of 


filters until the desired result is obtained. 


Simple ISS DELETE operator script using 

the Unix filter Awk. The script is called 
with two parameters, the first a file name 
and the second a string which 1s a correct 
Awk field matching expression. Awk sends 
its results to the standard output but this 
script redirects that output to the file 
ISSRESULT and then again to Standard output. 
(The script calling this may redirect the 
result of this call to some other file!) 
Usage: delete file Me Ond LELOn - 


st He che ste SHE HE SHE He He He 


awk -e ' BEGIN {FS = ":"\ 
OFS = F$\ 
{\ 


eee ole ie PElnt os LoSDELETE” }\ 
else { print }\ 
pe Slo 2) SSeS Uy 

eat LSSRESULT # To standard output 

# End script delete 


Figure 6.3 = Sample Delete Script 


Variables which need to be passed down to lower 
level scripts may be placed in the environment or passed as 
explicit parameters of the call. Values can be returned 
from lower levels by placing results in the standard output 
while the calling routine redirects standard output to a 
file or places it in a variable. This arrangement provides 
flexibility, since the calling routines have the freedom. to 


Place results, which may be entire files, wherever they 
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choose. This essentially is the philosophy of the Unix 
filter, .coupled with the freedom to redirect input and 
output. The set of ISS operators can be realized by using 
the Unix tools such as sed, Grep, Awk and Sort, and writing 
C routines to perform other low level routines. 

2. Data Base 

Unix files are Simply sequences of characters 
organized in a hierarchical naming structure. Some Unix 
commands provide field discrimination by the identification 
of fields separators and thus for a first prototype it would 
be best to use this facility. System and user tables can be 
files with embedded file separators. The choice of 
separators should be defined as a variable and placed in the 
environment by the top level module. The use of field 
Separators removes the need to be concerned about field 
lengths, therefore it is recommended that the prototype 
dispense with field length checks altogether, effectively 
making all fields of varying length. 

The ID field should be explicitly included in every 
file, although it is not clear if this is necessary or 
whether, when IDs are required for a data manipulation, the 
builtin line counters of the Unix filters Awk and Sed are 
sufficient to meet all needs. This uncertainty applies 
particularly to the applications which use sorted files. In 
any case, care must be taken to enSure line IDS are removed 


before attempting line comparisons in the UNION, DIFFERENCE 
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and INTERSECT operators, and resultant tables have all lines 
-renumbered contiguously. The tasks to strip the ID field 
from a table and renumber a stripped table or result are 
ideal tasks for C programs written as filters. 

The Unix files comprising the system should occupy 
their own set of subedirectories. It 1s suggested that the 
Sangie top level entry module called "iss" occupy a 
directory as the only file with two sub-directories: "bin" 
to hold all the executable scripts and object code of the C 
programs in the system, and "data" to hold all the data 
files for the system. The top level entry module may then 
include the path to iss/bin in its path length and rehash to 
allow access to the commands therein while executing the iss 
and deeper level scripts. A third sub-directory may be used 
to store all source programs for the entire system. 

feeeoesimple GPI Package 

A very simple query entry Bac lt Gy may be 
implemented using the Unix display editor Vi. It is 
stressed that this is not a true graphics implementation, 
but a simple measure for the first prototype. The essence 
is that the display is simply a file which is typed to the 
screen for display, or edited using Vi to enter queries. 
The screen file may be initially created by combining in 
sequence three template files holding the empty skeletons 
for the condition, and the selected tables. The templates 


may be made up for each of the fixed structure types and for 
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every individual database data table. Since the coples of 
the templates are a fixed number of lines in the screen 
file, the screen file may be split up into the original 
components for parsing or for replacement of part of the 
file by a TABLE or ERASE command. Entry to the query mode 
is then a simple call to Vi to edit the screen file. 

This query implementation 1S proposed because it is 
very Simple and easy to implement. It does however, place a 
burden on the user to uSe Vi to enter his queries, not 
disturbing the number of lines in the file or the table 
skeleton structure. 

Parsing can be done by searching for the strings’ in 
between the table column boundaries ( the character |) line 
by line from the bottom up. Parsing should be relatively 
Simple because of the simple rules of the GPI, that is: each 
operator operates on the current table, or if it is’ linked 
to a lower operator by a tuple variable it is applied to 
TSSRESULT. 

The use of the method outlined will require the 
Create command to produce the appropriate skeletons for new 
database tables and the command parser to produce _ the 
template for ISSRESULT each time it is created. 

4. Limitations 

An implementation as described in this section will 

provide a very basic ISS kernel which may be refined and 


improved in subsequent iterations of the project. The major 
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himitation is the graphics interface which will be 
particularly error prone if editing is not cone Somrecely. 
The skeleton templates will be the definition driving the 
field definition during parsing rather than the Schema as 
should be the case, hence if the tables are changed during 
query entry there will most likely be unpredictable behavior 
by the system. 

Manipulation of files using filters will be very 
slow, especially on medium to long files. Although it is 
much easier to implement the basic operators this way, files 
will need to be filtered three or more times for some 
operations rather than the direct access methods available 
uSing a proper DBMS. This will involve a high overhead of 
processing as well as the overhead of the I/0 involved in 
creating and using temporary files. Interpretation by the C 
Shell will further slow down execution, thus on the whole it 
will be unacceptably slow for any practical application. 

The use of field Separators and variable length 
fields will cause tables not to appear as "neat" tables but 
instead as untidy lines of uneven length. 

True security will not be possible since the Unix 
Operating System is not itself secure. 

Undoubtably, such an unsophisticated implementation 
will have many other limitations not described here, however 
the intention is to provide some guidance for future 


research and prototype implementation. 
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VII. CONCLUSION 


The results of this thesis research more than 
sufficiently demonstrate that the Relational Database model 
can indeed be used as the underlying model to conceptually 
integrate the five application areas of interest: text 
processing, form generation,database, spreadsheet modeling, 
and electronic mail. The integration of the common 
functions of these applications into one ISS kernel which 
uses a relation expressed in table format as the single data 
object and which uses a small vocabulary of commands is not 
only feasible, but also of high utility. 

The realization of all of the objectives stated in 
chapter one support the aforementioned claim and serve as 
the basis for the findings discussed in section A of this 
Snapter. Section B presents a brief discussion of related 


research issues which are suggested for follow-on research. 


A. FINDINGS 

One of the primary objectives of this thesis was to 
perform a detailed evaluation and analysis of the conceptual 
level of the design proposed in Nishimura's thesis [Ref. 1]. 
This included a rerevaluation of the logical databases 
(tables) designed for the ISS and resulted in the design of 


Supportive ISS system tables which could also be used to 
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incorporate multilevel security in the system. Another 
Significant change made in the design of the tables was the 
combination of the five application data table schema tables 
into one common Schema Table which is owned and maintained 
by the system. It was found that this method iS more 
efficient than maintaining separate schema tables for each 
application and more truly reflects the idea of an 
integrated system. 

Another major objective was to carefully select or 
revise the basic primitives needed to form the kernel 
command vocabulary. It was found that’ the primitive 
operators, Modify,Insert, Delete, Project, Select, Sort, and 
Concatenate, and the combining operators, Join, Union, Set 
Difference, Intersection, and Natural Join, form a complete 
set of operators necessary to perform all operations common 
and desirable to each specific application area. This set 
of primitive and combining operators coupled with the 
ability to combine data tables of different types prove to 
be quite useful in expanding the expressive power of the 
system. 

In addition to refining the original thesis [Ref. 1], 
major efforts went towards expanding it to include the 
design of a User Interface to ISS and to propose an 
Implementation Strategy for the system. This extension 
resulted in the design OE GPI (Graphics Prototype 


Interface), a Data Manipulation Language which enables the 
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user to use QBErlike simple graphics to easily express 
queries of varying degrees of complexity. Auiweof the 
primitive and combining operators are mapped into GPI 
commands and the details of their implementation are 
completely hidden from the user. Therefore, any subsequent 
change to the GPI Language can be made without having to 
completely redesign the system. AlSo, by abstracting the 
implementation details from the User Interface, GPI lends 
itself quite well to even the most naive user who has_~ some 
basic background with database query languages. 

The Implementation Strategy proposed herein suggests 
that the UNIX Operating System Environment ( including the 
Shell and C programming languages) can be used to implement 
Emeprototype of the ISS. The feasibility of this suggestion 
was tested and proved by implementing crude prototypes of 
the basic primitives and combining operators. 

Having revised and extended the proposal of Nishimura's 
thesis (Ref. 1) down to the level of Implementation 
Strategy, we feel that we have strongly demonstrated that 
using the Relational Database model in effecting the 
Integrated Software System is highly feasible and useful. 
Based on these findings it 1S Suggested that furtherance of 
this research topic to the point of the completion of the 


implementation of the proposed design is desirable. 
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B. FOLLOW=-ON RESEARCH 

During the design of the ISS several issues arose which 
gave rise to follow-on research ideas. Some of these were 
viewed as shortfalls or limitations in the proposed design. 
Others, such as Database Management System issues were 
considered to be outside the scope of this’ thesis. These 
and other related follow-on research ideas are discussed in 
the subsequent paragraphs. 

The current design of the ISS does not address the issue 
of efficiency since the main concern was_ to prove the 
feasibility of the proposal. Further iterations of this 
Same topic should address efficiency of the design as it 
impacts on the physical implementation of the conceptual 
design. 

This iteration of the research focuses only on the 
design of the kernel of the ISS, omitting efforts related to 
the design of the non-integratable features unique to the 
individual application areas. The conceptual design and 
implementation of the application specific commands’ and 
viewS need to be developed. It is recommended that the 
framework of the design not deviate highly from that 
Suggested for the kernel. Also, in some cases a Subset of 
the five applications may need to perform the same or 
Similar functions. In other words, displaying a spreadsheet 
view is quite similar to displaying a view of a = form. In 


order to continue with the premise that the total ISS 
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command vocabulary be small, it 1S Suggested that "sub- 
kernel" commands be developed to support those subsets of 
applications that have similar functions that are not 
applicable to all five applications. 

The Implementation Strategy proposed addresses simple 
graphics issues, i.e., the drawing of tables on the CRT. 
However, it is recognized that a much more sophisticated 
Graphics Package is necessary to effect the High Level 
System Graphics commands proposed in the design of the User 
Interface. Such issues as widening columns’) during the 
insert mode to accommodate the input of variable length data 
attributes or displaying the complete field of a variable 
mength attribute during the display mode should be 
incorporated into the design. 

One further recommendation which is directly related to 
the current design is that the Physical Level of the design 
Should be carefully studied. This will probably lead to a 
revision of some of the features presented in the conceptual 
design. 

As stated before, the ISS system is based on the 
Relational Database model but is not, within itself, a 
Database Management System (DBMS), therefore, many of the 
issues that would be handled by the DBMS are not presented 
within this thesis. Nonetheless, it is recognized that in 
order for the ISS to be fully implemented and functional in 


a concurrent multi-user environment, the underlying DBMS 
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must be designed. This would include the research and 
Besign Of [5S Security. Although this version of ISS does 
ho t implement system security, it provides the basic 
framework for incorporating multilevel security. It is 
Suggested that this framework be reviewed and modified as 
necessary to Support actual implementation of a secure ISS 
package. 

Another DBMS research effort which lends itself to the 
Current proposal is that of transaction recovery. This 
issue was considered briefly with the inclusion of the UNDO 
system command which enables the user to reverse the effects 
of the last committed transaction. Of course this is a 
Simplistic view of a highly complex problem and iit is 
Suggested that research in this area constitute a complete 
thesis topic. 

Both the non=DBMS and the DBMS issues discussed herein 
are viewed as being important and necessary research 
follow-on items. Furtherance of research on the suggested 
topics would certainly provide tangible benefits and lead to 
the completion of the design and implementation of ISS, the 


Integrated Software System. 


Jey 
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